John W. Tukey, Princeton University 


This is the report of a committee appointed by the Commission on 
Statistical Standards of the American Statistical Association to 
review the statistical methods used in Sexual Behavior in the Human 
Male. We shall refer both to the book and to its authors (Kinsey, 
Pomeroy and Martin) as KPM. The committee wishes to emphasize 
that this report is confined to statistical methodology, and does not 
concern itself with the appropriateness or the limitations of orgasm 
as a measure of sexual behavior. The treatment of specific problems 
has necessitated an examination of some of the statistical and method- 
ological problems of such studies, and the organization of frames of 
reference in which the statistical methods can be discussed. The com- 
mittee hopes that both detailed and general considerations will be of 
service to Dr. Alfred C. Kinsey and his co-workers; to the National 
Research Council’s Committee for Research on Problems of Sex, who 
requested the appointment of this committee; and to others facing 
similar statistical or methodological problems. 
We have endeavored to write this report in a way that would mini- 
mize the possibility of misunderstanding. To do this, it is necessary to 
* This article consists of the main text, but not the appendices, of the report of a committee ap- 
pointed in 1950 by S. S. Wilks as President of the American Statistical Association, to review the sta- 
tistical methods used by Alfred C. Kinsey, Wardell B. Pomeroy, and Clyde E. Martin in their Sexual 
Behavior in the Human Male (Philadelphia, W. B. Saunders Co., 1948). For further details on the ap- 
pointment of the committee and its charge, see Section 1, p. 676 below. For an outline of the appendices, 
as well as of this paper, see Section 3, pp. 678-81. Appendix G, “Principles of Sampling,” will appear 
as an article in the March issue of this JOURNAL. The full report, including both the text given here 
and the appendices, will be published as a monograph by the American Statistical Association in 1954. 










674 AMERICAN STATISTICAL ASSOCIATION JOURNAL, DECEMBER 1953 


deal with many detailed aspects of the work, one at a time. By judicious 
selection of topics and attitudes, it would have been possible to write 
two factually correct reports, one of which would leave the impression 
with the reader that KPM’s work was of the highest quality, the other 
that the work was of poor quality and that the major issues were 
evaded. We have not written either of these extreme reports. 

Even within the present report, a reader who is trying only to sup- 
port his own opinions could select sections and topics to buttress either 
view. In the details of this report the reader will find numerous prob- 
lems that we feel KPM handled admirably. If he pays attention only 
to these, he would find support for the opinion that the work is nearly 
impeccable and that the conclusions must be substantially correct. 
There are other problems which we believe KPM failed to handle ade- 
quately, in some cases because they did not devote the necessary skill 
and resources to the problems, in other cases because no solutions for 
the problems exist at present. The reader who concentrates only on the 
parts of our report in which such problems are discussed would find 
support for the opinion that KPM’s work is of poor quality. 

Our own opinion is that KPM are engaged in a complex program of 
research involving many problems of measurement and sampling, for 
some of which there appear at the present to be no satisfactory solu- 
tions. While much remains to be done, our overall impression of their 
work to date is favorable. 

Many details are discussed in the body and appendices of this report. 
The main conclusions are as follows: 

1. The statistical and methodological aspects of KPM’s work are 
outstanding in comparison with other leading sex studies. In a com- 
parison with nine other leading sex studies (four supported in part 
by the same NRC Committee) KPM were superior to all others in 
the systematic coverage of their material, in the number of items which 
they covered, in the composition of their sample as regards its age, 
educational, religious, rural-urban, occupational, and geographic repre- 
sentation, in the number and variety of methodological checks which 
they employed, and in their statistical analyses. So far as we can judge 
from our present knowledge, or from the critical evaluations of a num- 
ber of other qualified specialists, their interviewing was of the best. 

2. KPM’s interpretations were based in part on tabulated and statis- 
tically analyzed data, and in part on data and experience which were 
not presented because of their nature or because of the limitations of 
space. Some interpretations appear not to have been based on either 
of these. We feel that unsubstantiated assertions are not in themselves 
inappropriate in a scientific study. The accumulated insight of an 
experienced worker frequently merits recording when no documenta- 
tion can be given. However, KPM should have indicated which of their 
statements were undocumented or undocumentable and should have 
been more cautious in boldly drawing highly precise conclusions from 
their limited sample. 

3. Many of KPM’s findings are subject to question because of a 
possible bias in the constitution of the sample. This is not a criticism 
of their work (although it is a criticism of some of their interpretations). 
No previous sex study of a broad human population known to us, medi- 
cal, psychiatric, psychological, or sociological, has been able to avoid 
this difficulty, and we believe that KPM could not have avoided the 
use of a nonprobability sample at the start of their work. Something 
may now perhaps be done to study and reduce this possible bias, by a 
probability sampling program. 

In our opinion, no sex study of a broad human population can expect 
to present incidence data for reported behavior that are known to be 
correct to within a few percentage points. Even with the best available 
sampling techniques, there will be a certain percentage of the popula- 
tion who refuse to give histories. If the percentage of refusals is 10 
per cent or more, then however large the sample, there are no statistical 
principles which guarantee that the results are correct to within 2 or 
3 per cent. The results may actually be correct to within 2 or 3 per cent, 
but any claim that this is true must be based on the undocumented 
opinion that the behavior of those who refuse to be interviewed is not 
very different from that of those who are interviewed. These comments, 
which are not a criticism of KPM’s research, emphasize the difficulty 
of answering the question: “How accurate are the results?”, which is 
naturally of great interest to any user of the results of a sex study. 
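
The arithmetic behind this limit can be made explicit. What follows is a minimal sketch and not part of the committee's computations: the function name `incidence_bounds` and the figures (a reported incidence of 50 per cent among those interviewed, a refusal rate of 10 per cent) are assumptions chosen for illustration. It brackets the population incidence by supposing first that no refuser, and then that every refuser, has the behavior in question.

```python
def incidence_bounds(respondent_incidence, refusal_rate):
    """Worst-case range for the incidence in the whole selected population.

    respondent_incidence: proportion reporting the behavior among those interviewed.
    refusal_rate: proportion of the selected population who refuse (hypothetical input).
    """
    responding = 1.0 - refusal_rate
    low = responding * respondent_incidence                  # no refuser has the behavior
    high = responding * respondent_incidence + refusal_rate  # every refuser has it
    return low, high


# Illustrative figures only: 50 per cent among respondents, 10 per cent refusals.
low, high = incidence_bounds(0.50, 0.10)
print(f"{low:.2f} to {high:.2f}")
```

With these illustrative figures the range runs from 45 to 55 per cent, that is, 5 points on either side of the respondents' figure, and it does not narrow as the sample grows. Any tighter claim rests on an assumption about how the refusers behave.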

4. Many of KPM’s findings are subject to question because of possi- 
ble inaccuracies of memory and report, as are all studies of intimate 
human behavior among broad segments of the population. No one has 
proposed any way to remove the dangers of recall (involving both 
memory and report) and KPM were superior to the nine studies re- 
ferred to above in their attempts to control and measure these dangers. 
We have suggested still further expansions of their methodological 
checks. 

Until new methods are found, we believe that no sex study of inci- 
dence or frequency in large human populations can hope to measure 
anything but reported behavior. It may be possible to obtain observed 
or recorded behavior for certain special groups, but no suggestions have 







been made by KPM, the critics, or this committee which would make 
it feasible to study observed or recorded behavior for a large human 
population. These remarks are intended as a comment on the present 
status of research techniques in sex studies and not as a criticism of 
KPM’s work. 

5. KPM received only limited statistical help, in part because the 
work was pursued during the War years when such expert help was 
difficult to find for non-military projects. In view of the limited statis- 
tical knowledge which was available to them, as made clear by the 
failure of their sample size experiment, KPM deserve much credit 
for the straight thinking which brought them safely by many pitfalls. 
Their need of adequate statistical assistance continues to be serious. 
Substantial assistance might come through the development of a 
statistical clinic at Indiana University, or through the addition of a 
statistical expert to KPM’s own staff. Unfortunately the sort of assist- 
ance which might resolve some of their most complex problems would 
require understanding, background, and techniques that perhaps not 
more than twenty statisticians in the world possess. 

6. A probability sampling program should be seriously considered 
by KPM. The actual gains from an extensive program are limited, to 
an extent unknown at present, by refusal rates and indirectly by costs, 
particularly by the costs of maintaining the present quality of the individual 
histories by KPM’s approach. A step-by-step program, starting 
with a very small pilot study, is recommended. 

7. In addition to proposing a probability sampling program, we 
have made numerous suggestions in this report for the modification 
and strengthening of KPM’s present approach. The suggestions in- 
clude expanded methodological checks of their sampling program, a 
further study of their refusal rate, some modification of their methods 
of analyses, further comparisons of reported vs. observed behavior, 
and stricter interpretations of their data. We have been informed by 
KPM that many of these improvements, including some expansion 
of their techniques for obtaining data, have already been incorporated 
in the volume dealing with sexual behavior in the human female. 


CHAPTER I. BACKGROUND AND ORGANIZATION 
1. Organization involved 


This committee, consisting of William G. Cochran, Chairman, 
Frederick Mosteller, and John W. Tukey, was appointed by President 
S. S. Wilks in September 1950 as a committee of the Commission on 
Statistical Standards of the American Statistical Association. This 
action was initiated by a request from the Committee for Research 
on Problems of Sex of the National Research Council, as indicated by 
the following excerpt from a letter dated May 5, 1950, from Dr. George 
W. Corner, a member of the NRC Committee, to Dr. Isador Lubin, 
Chairman of the Commission on Statistical Standards of the American 
Statistical Association. 

“In accordance with our telephone conversation of yesterday, I am writing 
to state to you the desire of the Committee for Research in Problems of Sex, 
of the National Research Council, that the Commission on Standards of the 
American Statistical Association will provide counsel regarding the research 
methods of the Institute for Sex Research of Indiana University, led by 
Dr. Alfred C. Kinsey. 

“This Committee has been the major source of financial support of Dr. 
Kinsey’s work, and at its annual meeting on April 27, 1950, again renewed 
the expression of its confidence in the importance and quality of the work 
by voting a very substantial grant for the next year. 

“Recognizing however that there has been some questioning, in recently 
published articles, of the validity of the statistical analysis of the results 
of this investigation, the Committee, as well as Dr. Kinsey’s group, is 
anxious to secure helpful evaluation and advice in order that the second 
volume of the report, now in preparation, may secure unquestioned ac- 
ceptance.” 


Some correspondence ensued, in which Wilks indicated the willing- 
ness of the American Statistical Association to provide counsel as 
requested. 

Kinsey, in a letter to Wilks dated August 28, stated that 

“we should make it clear that we deeply appreciate the willingness of the 
American Statistical Association to undertake such an examination of our 
statistical methods, that we will give it full cooperation in having access 
to all of our data as far as the peculiar confidential nature of our data will 
allow, and that we understand, of course, that the committee shall be free to 
publish its findings of whatever sort.” 


In the same letter, Kinsey also made a number of suggestions about 
the constitution and work of the committee, to the effect that the 
persons on the committee should be primarily statisticians with experi- 
ence in human population studies, that they should plan to review 
the statistical criticisms which have been published about the book on 
the male, and that they should compare methods used by Kinsey and 
his associates in their research with methods in other published research 
in similar fields. 

With respect to the research on the human female, Kinsey wrote as 
follows: 







“It should, however, be made clear that all the data that will go into our 
volume on Sexual Behavior in the Human Female are already gathered, that 
the punch cards have already been set up and most of them punched, and 
that statistical work is proceeding on that volume now. While the recom- 
mendations of the committee may modify further work, it can affect this 
forthcoming volume only in the form in which the material is presented, the 
limitations of the conclusions, and the careful description of the limitations 
of our method and conclusions.” 


2. Committee procedure 


Although no specific written directive was issued to the committee, 
the letter quoted earlier from Corner to Lubin sets forth the task as- 
signed to the committee. In one respect the scope was deliberately 
reduced as compared with that envisaged in the letter. The committee 
decided not to undertake any examination of the researches and data 
relating to the human female, in order to avoid disruption of Kinsey’s 
proposed schedule of work. 

In October, 1950, the committee spent five days at the Institute for 
Sex Research of Indiana University, accompanied by Mr. Robert 
Osborn as assistant. Subsequent meetings of the committee were held 
at Chicago (December 1950), Princeton (January 1951), Cambridge 
(May 1951), Baltimore (July 1951) and Princeton (October 1951). 

In their review of previous studies of sexual behavior, the committee 
received major assistance from Dr. W. O. Jenkins, who prepared a 
series of reports which appear in Appendix B. Mr. A. Kimball Romney 
prepared a helpful index of the principal criticisms made of the statisti- 
cal methodology used in the book Sexual Behavior in the Human 
Male. 


3. Structure of this report as a whole 


KPM’s program of research is a major undertaking, involving more 
than ten years’ work. Any discussion of it which aims at thoroughness 
must itself be lengthy. In order to keep the main body of our report 
down to a reasonable length, we have relegated much of the documen- 
tation of our conclusions, and all detailed discussion, to the following 
series of appendices. 


A. Discussion of comments by selected technical reviewers. 
B. Comparison with other studies. 
C. Proposed further work. 
D. Probability sampling considerations. 
E. The interview and the office as we saw them. 
F. Desirable accuracies. 
G. Principles of sampling. 

Appendix A contains our discussion of the statistical and quantitative 
methodological content of six of the critical reviews which appeared 
after the publication of the KPM book. These six were chosen 
from among the large number of published reviews, because they 
concentrated their attention on the statistical aspects of the research. 
Appendix A also includes, where this seems appropriate, discussion of 
some critical points which were not explicitly raised in the reviews in 
question. 

Appendix B, by W. O. Jenkins, contains a review of the statistical 
aspects of eight of the major previous sex studies which have been car- 
ried out in the United States. Also included are similar reviews of the 
KPM book and of one more recent study by J. E. Farris. The purpose 
of this appendix is to provide a basis for comparing the KPM study 
with the other studies as to comprehensiveness, sampling methods, 
interviewing methods and statistical analysis. 

Appendix C begins by outlining and commenting on suggestions for 
further work made by the reviewers. It explains the difficulty of esti- 
mating the stability of results from a sampling procedure such as 
KPM’s, offers some possible methods for this estimation, and suggests 
how more appropriate variables for expressing sexual behavior might 
be developed, and how compound variables might be built on these. 
It then explores the problem of when to adjust, giving a simple numeri- 
cal procedure for making the decision, and concludes by summarizing 
the probability sampling suggestions derived from Appendix D. 

Appendix D discusses the problems of analysis and usefulness of 
probability sampling as a check on a nonprobability sample, particu- 
larly when refusal rates are considered; two possible types of probabil- 
ity samples and a probability sampling program which KPM might 
undertake; and the alternative of studying restricted populations. 

Appendix E discusses the interview and the office as we saw them. 
Appendix F discusses what seems to be known about the accuracy 
needed in such work as KPM’s. Appendix G presents an account of 
the principles of sampling illustrated with general examples. 

Many of the problems faced by KPM occur in most types of soci- 
ological investigation. Some are likely to be encountered in almost any 
kind of scientific investigation. For this reason, we have thought it 
advisable to present certain of the methodological issues in rather 
general terms. 

The reader is asked to bear in mind that in general our conclusions 
are not documented in the main body of the report, but in the appendices 
to which references are given. 


4. Structure of the main body 


In preparing the main body, we have stressed easy reference and 
have kept related matters together at the expense of fluency of arrange- 
ment and lack of repetition. Thus our main conclusions in a form in- 
tended for the general reader take 3 pages in the digest above, while 
more detailed conclusions, expressed for a more technical audience, take 
3 pages in Chapter XI. A particular subject summarized there is also 
likely to be discussed once in Chapter II, where we try to point out 
what KPM did, once again in one of Chapters IV to IX, where we 
assess KPM on an absolute scale, and yet again in Chapter X, where 
we compare KPM with previous workers in the field. This is repetitive, 
but we hope that it will permit ready reference and avoid treating 
subjects out of context. 

After this introductory chapter on background structure, the re- 
mainder of the main body falls into three parts: 







(i) Chapters II and III. In the first of these, we describe, respectively, 
what choices KPM had to make and what they chose. In Chapter III 
we outline some essential principles of sampling, which seem not to 
have been clearly enough formulated or widely enough understood. 
These chapters are introductory. 

(ii) Chapters IV to XI. In the first six of these, we try to compare 
KPM’s work with an absolute standard. The order chosen (interview, 
sample, methodological checks, analytical techniques, complex examples, 
interpretation) is that in which the problems arise in an evolving 
study such as KPM’s. Chapter X compares KPM with previous works 
on the basis of Appendix B, while Chapter XI summarizes the conclu- 
sions of this part. 

(iii) Chapter XII. This discusses briefly various suggested expendi- 
tures of further effort. 


CONTENTS 
Structure of the main body 


CHAP. II. MAJOR AREAS OF CHOICE 

What sort of behavior? 
Whose behavior? 
Observed, recorded or reported behavior 
Interview or questionnaire, and types thereof 
Which subjects? 


CHAP. III. PRINCIPLES OF SAMPLING 

Introduction 
Cluster sampling 
Possibilities of adjustment 
Probability samples 
Nonprobability samples 
Sampled population and target population 


CHAP. IV. THE INTERVIEW AREA 

Interview vs. questionnaire 
Interviewing technique 






STATISTICAL PROBLEMS OF THE KINSEY REPORT 


CHAP. V. THE SAMPLING AREA 

KPM’s sampled population 
Could KPM have used probability sampling? 

KPM’s checks 

Variables affecting sexual behavior 
Definition of the variables 
Assessing effects of variables 
The measurement of activity 
Tests of significance 


CHAP. XII. SUGGESTED EXTENSIONS 

Probability sampling 
Retakes 
Spouses 
Presentation 
Statistical analyses 
Relative priorities 


CHAPTER II. MAJOR AREAS OF CHOICE 
5. What sort of behavior? 


The purpose of Chapter II is to record in summary form the major 
choices made by KPM. 

Certainly the choice of orgasm as the central sort of sexual behavior 
for study was a major one, leading to consequences whose statistical 
aspects will be discussed in various places, but this choice is not a mat- 
ter of general quantitative methodology, and hence falls outside the 
scope of this committee’s task. 


6. Whose behavior? 


KPM had to choose the population to which this study should apply. 
This decision does not seem to have been made clearly. From the basis 
for the “U. S. Corrections” (p. 105) we should infer it to be “all U. S. 
white males.” If it were the population to which the U. S. Corrected 
sample actually applies on the average (the sampled population, see 
Section 18), it would be a rather odd white male U. S. population. 
It would have age groups, educational status, rural-urban background, 
marital status and all their combinations according to the 1940 census, 
but it would have more members in Indiana than in any other state, 
and it would have been selected to an unknown degree for willingness 
to volunteer histories of sexual behavior. We do not regard this descrip- 
tion of the sampled population as an automatic criticism, as some crit- 
ics do. We make it here as a factual statement, noting that the careful 
and wise choice of the sampled population, although difficult, is a rela- 
tively free choice of the investigator. More discussion relevant to this 
point will be found in Chapter II-G (Appendix G). 

Further, KPM chose to study the behavior of many (at least 163 
in tabular form) segments of this large population, feeling, apparently, 
both that comparisons among segments would be illuminating and 
that data for (clinical) application to individuals should come from a 
reasonably homogeneous segment. KPM’s choice of a broad population 
created many problems, particularly in sampling. Whether they would 
have been well advised to confine themselves to a more restricted popu- 
lation, e.g., the state of Indiana, is debatable. For our part, we are willing 
to take their choice as given, and to discuss briefly elsewhere some alter- 
natives for further work (Chapter IX-D). 


7. Observed, recorded, or reported behavior 


KPM, interested in actual behavior, had, in principle, the choice 
of studying observed, recorded, or reported behavior. But since they 
selected a broad population and orgasm as the type of behavior, their 
only feasible choice seems to have been reported behavior. This situa- 
tion does not seem likely to change in the foreseeable future. 

The choice of reported behavior implies that the question: “On the 
average, how much difference is there between present reported and 
past actual behavior?” is seriously involved in any inferences about 
actual behavior which are attempted from KPM’s results. The differ- 
ence might well be large, leading to a large systematic error in measure- 
ment. However, use of observed or recorded behavior in order to avoid 
this difference does not seem to us a feasible way to measure nation- 
wide incidences and frequencies for KPM’s broad population, because 
it would have produced systematic errors in sampling possibly larger 
than the error in measurement. 


8. Interview or questionnaire, and types thereof 


Having settled on reported behavior, KPM had to decide whether 
this report should be oral or written, and what methods should be 
used to elicit it. Their choice was oral, in a face-to-face interview 
whose flavor was designed to be that of a doctor or family friend. 
The choice of oral rather than written report: 


(1) made it possible to obtain apparently satisfactory answers from 
many more subjects (the percentage of complete illiteracy in the 
U. S. is small, but the percentage of illiteracy on complex sub- 
jects not usually written about is undoubtedly substantial). 

(2) permitted and encouraged variation of the form of the questions 
to suit the subject and the situation. 


Those, like some critics, who believe in a repeatable measurement 
process, regardless of whether or not it measures something that is 
always relevant, find (2) bad. Those who, like KPM, feel that appropriately 
flexible wording improves communication and thus improves 
the quality of report despite the variability resulting from changes 
in the form of questions, find (2) good. 

Given an interview rather than a questionnaire, the remaining 
choices of KPM follow a consistent pattern. In nearly every case 
their approach resembled the clinical interview more closely than the 
psychometric test. 


9. Which subjects? 
Here there are various choices, pertaining to: 


(1) selection of individuals one at a time or in clusters. 

(2) keeping age, education, marital status, etc., segments in the sam- 
ple proportionate to those in the population or making them of 
more nearly equal size. 

(3) selecting individuals on a catch-as-catch-can basis, a partly ran- 
domized basis, or according to a probability sampling plan. 


They chose: 


(1) to select individuals in clusters. 
(2) to keep age, education, marital status, etc., segments more nearly 
equal in the sample than in the population. 


(3) to use no detectable semblance of probability sampling ideas. 


The pros and cons will be discussed later. 


10. What methodological checks? 


There are choices as to the types of checks and the number of each 
to be made. The types of checks made by KPM, including 


(1) take-retake, 

(2) husband-wife, 

(3) duplicate recording of interview, 
(4) overall comparison of interviews, 
(5) others (see Chapter V-A) 


seem to cover all those easily thought of. The numbers of checks made 
are discussed later. Duplicate recording of interviews occurred in an 
unknown, but presumably small, number of cases. No comparisons 
from duplicate recordings were reported, perhaps because most oc- 
curred in connection with the training of interviewers. 


11. How analyzed and presented? 


In analyzing frequency and incidence of activity, KPM chose to 
report both raw and “U.S. Corrected” data and to make simple com- 
parisons. Just what was done in general was clearly stated, but the 
steps involved in detailed computations were not explained. No at- 
tempt was made to find helpful scales or composite variables (see 
Chapters IV-C and V-C). 
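
Since the detailed steps were not explained, the general idea of such a correction can only be illustrated. The sketch below is not KPM's computation: the segment names, sample sizes, incidences, and census shares are all invented, and the function name `weighted_incidence` is hypothetical. It shows how segment results, once reweighted to census proportions, can differ from the raw figure when the sample over-represents some segments.

```python
def weighted_incidence(sample_incidence, population_share):
    """Combine segment incidences using population (census) shares as weights."""
    total_share = sum(population_share.values())
    return sum(
        sample_incidence[segment] * population_share[segment] / total_share
        for segment in population_share
    )


# Hypothetical segments and figures, for illustration only.
sample_incidence = {"grade school": 0.40, "high school": 0.50, "college": 0.70}
sample_size = {"grade school": 200, "high school": 300, "college": 500}
census_share = {"grade school": 0.50, "high school": 0.35, "college": 0.15}

# Raw figure: every interviewed subject counts equally.
raw = sum(
    sample_incidence[s] * sample_size[s] for s in sample_size
) / sum(sample_size.values())

# Corrected figure: each segment counts in proportion to its census share.
corrected = weighted_incidence(sample_incidence, census_share)
print(f"raw {raw:.3f}, corrected {corrected:.3f}")
```

In this invented case the raw incidence is 0.58 and the corrected incidence 0.48, because the college segment is half of the sample but only 15 per cent of the assumed population. The correction adjusts only for the variables used as weights; it cannot remove selection within a segment, such as willingness to volunteer.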

With the exception of “U. S. Corrections,” most of the analysis of 
the tabular data is confined to straightforward description. Some at- 
tention is paid to the problem of sample-population relation in the form 
of standard errors (presumably underestimated because they were 
based on the assumption of random sampling). However, this ap- 
proaches lip service, since many apparent differences are discussed 
with no attention to significance or nonsignificance. (Again we do not 
regard this as an automatic criticism, particularly since accurate indi- 
cation of significance would have been difficult—see Section A-18.) 

In analyzing cumulative activity, KPM’s main tool was the accumu- 
lative incidence curve, a technique which they developed independ- 
ently. 
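
The idea of an accumulative incidence curve can be sketched as follows. This is our reading of the general technique, stated as an assumption, and not a reproduction of KPM's procedure: at each age, the curve gives the proportion with experience by that age among subjects who had reached that age when interviewed. The function name `accumulative_incidence` and the five histories are hypothetical.

```python
def accumulative_incidence(histories, ages):
    """Proportion of eligible subjects with experience by each age.

    histories: list of (age_at_interview, age_at_first_experience or None).
    A subject is eligible at age a only if interviewed at age a or later.
    """
    curve = {}
    for a in ages:
        eligible = [h for h in histories if h[0] >= a]
        if not eligible:
            curve[a] = None  # nobody in the sample has reached this age
            continue
        experienced = sum(
            1 for _, first in eligible if first is not None and first <= a
        )
        curve[a] = experienced / len(eligible)
    return curve


# Invented histories: (age at interview, age at first experience or None).
histories = [(20, 15), (25, None), (30, 22), (18, 17), (40, 16)]
curve = accumulative_incidence(histories, [15, 20, 25, 30])
for age, value in curve.items():
    print(age, value)
```

Each subject contributes to every age he has already passed, so the number of eligible subjects shrinks at the older ages and the later points of the curve rest on fewer histories.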


12. How interpreted? 
The main choices concerned 


(1) extent of warning about possible differences between reported 
behavior and actual behavior, 

(2) extent of warning about possible differences between the sam- 
pled population (see Section 18) and the entire U.S. white male 
population, 

(3) extent of warning about sampling fluctuations, 

(4) extent of verbal discussion not based on evidence presented, 

(5) certainty with which conclusions were presented. 


Under (1) the emphasis was on methodological checks in order to indi- 
cate, as far as they could, how small this difference seemed to KPM 
to be. Under (2) there was little discussion. Under (3) the warnings 
were made early, incompletely, but not often. Under (4) the extent 
of discussion was substantial, most of it aimed at social and legal atti- 
tudes about sexual behavior, and descriptions or practices not covered 
by the tables. Under (5) the conclusions were usually presented with an 
air of solid certainty. 

In general the observations seem to have been interpreted with more 
fervor than caution, although occasional qualifications may be found. 


CHAPTER III. PRINCIPLES OF SAMPLING 
13. Introduction 

It is difficult, if not impossible, to assess the quality of any sample 
and its analysis without comparing it with a set of principles. This is 
particularly true of KPM’s works. The present chapter endeavors to 
set down, in compact form, a few of the principles of sampling which 
are especially relevant to a consideration of KPM’s sampling. As we 
have noted (Section 9), KPM chose to select individuals in groups or 
clusters, to divide the population into segments and keep segment 
sizes more nearly equal in the sample than in the population, and to 
use no semblance of probability sampling ideas. The discussion in this 
chapter concentrates on these aspects of sampling. 

Many readers will, we believe, desire a more connected account of 
the principles of sampling, with examples and fuller discussion. These 
are provided in Appendix G. Any reader who finds the statements 
used in this chapter unclear, or not intuitively acceptable, is urged to 
turn to Appendix G before proceeding further. Once there, he should 
read through from the beginning, since argument and exposition there 
are closely knit and unsuited to piecemeal references. 

Whether by biologists, sociologists, engineers, or chemists, sampling 
is often taken too lightly. In the early years of the present century, it 
was not uncommon to measure the claws and carapaces of 1000 crabs, 
or to count the number of veins in each of 1000 leaves, and to attach 
to the results the “probable error” which would have been appropriate 
had the 1000 crabs or the 1000 leaves been drawn at random from the 
population of interest. If the population of interest were all crabs in a 
wide-spread species, it would be obviously almost impossible to take 
a simple random sample. But this does not bar us from honestly assess- 
ing the likely range of fluctuation of the result. Much effort has been 
applied in recent years, particularly in sampling human populations, 
to the development of sampling plans which, simultaneously, 


(i) are economically feasible, 
(ii) give reasonably precise results, and 
(iii) show within themselves an honest measure of fluctuation of 
their results. 


Any excuse for the practice of treating non-random samples as random 
ones is now entirely tenuous. Wider knowledge of the principles involved 
is needed if scientific investigations involving samples (and what such 
investigation does not involve samples?) are to be solidly based. 
Additional knowledge of techniques is not so vitally important, though 
it can lead to substantial economic gains. 

14. Cluster sampling 


A botanist who gathered 10 oak leaves from each of 100 oak trees 
might feel that he had a fine sample of 1000, and that, if 500 were 
infected with a certain species of parasites, he had shown that the 
percentage infection was close to 50%. If he had studied the binomial 
distribution, he might calculate a standard error according to the usual 
formula for random samples, p ± √(pq/n), which in this case yields 
50 ± 1.6% (since p = q = .5 and n = 1000). In doing this he would neglect 
three things: 


(i) probable selectivity in selecting trees (favoring large trees, 
perhaps), 

(ii) probable selectivity in choosing leaves from a selected tree 
(favoring well-colored or, alternatively, visibly infected leaves, 
perhaps), and 

(iii) the necessary allowance, in the formula used to compute the 
standard error, for the fact that he had not selected his leaves 
individually. 


Most scientists are keenly aware of the analogs of (i) and (ii) in their 
own fields of work, at least as soon as they are pointed out to them. 
Far fewer seem to realize that, even if the trees were selected at ran- 
dom from the forest, and 10 leaves were chosen at random from each 
selected tree, (iii) must still be considered. But if, as might indeed be 
the case, each tree were either wholly infected or wholly free of infec- 
tion, then the 1000 leaves tell us no more than 100 leaves, one from 
each tree, since each group of 10 leaves will be all infected or all free 
of infection. In this event, we should take n=100 in calculating the 
standard error and find an infection rate of 50 ± 5%. Such an extreme 
case of increased fluctuation due to sampling in groups or clusters 
would be detected by almost all scientists, and is not a serious danger. 
But less extreme cases easily escape detection. 
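
The botanist's arithmetic, and a less extreme case, can be sketched in a few lines. This is an illustration only. The inflation factor 1 + (m - 1)ρ used below (m leaves per tree, ρ the correlation between leaves on the same tree) is a standard approximation from the sampling literature and is not stated in this report; the function names and the value ρ = 0.2 are assumptions made for the example.

```python
import math

def srs_standard_error(p, n):
    """Standard error of a proportion under simple random sampling: sqrt(pq/n)."""
    return math.sqrt(p * (1.0 - p) / n)

def cluster_standard_error(p, n_clusters, cluster_size, rho):
    """Approximate standard error for equal-sized clusters.

    Assumption (not from the report): the variance is inflated by the
    factor 1 + (m - 1) * rho, where rho is the intraclass correlation.
    """
    n = n_clusters * cluster_size
    design_effect = 1.0 + (cluster_size - 1) * rho
    return srs_standard_error(p, n) * math.sqrt(design_effect)

# 1000 leaves treated as if drawn individually at random.
naive = srs_standard_error(0.5, 1000)
# Each tree wholly infected or wholly free: rho = 1, so only 100 leaves count.
extreme = cluster_standard_error(0.5, 100, 10, rho=1.0)
# A hypothetical intermediate case, the kind that easily escapes detection.
mild = cluster_standard_error(0.5, 100, 10, rho=0.2)

print(f"naive {100 * naive:.1f}%  extreme {100 * extreme:.1f}%  mild {100 * mild:.1f}%")
```

With ρ = 1 the sketch reproduces the 5% figure of the text; any intermediate ρ gives a standard error between the naive and the extreme values.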

We have just described, as one example of the reasons why the 
principles of sampling need wider understanding, an example of 
cluster sampling, where the individuals or sampling units are not 
drawn separately and independently into the sample, but are drawn 
in clusters, and have tried to make it clear that “individually at ran- 
dom” formulas do not apply. Cluster sampling is often desirable, but 
must be analyzed appropriately. KPM’s sample was, in the main, 
a cluster sample, since they built up their sample from groups of people 
rather than from individuals. 

15. Possibilities of adjustment 


Often the population is divided into segments of known relative size, 
perhaps from a census. It is sometimes thought that the best method 
of sampling is to take the same proportion from every segment, so 
that the sample sizes in the segments match the corresponding popula- 
tion sizes. Such samples do have the advantage of simplifying computa- 
tions by equalizing weights, and they sometimes lead to a reduction 
of sampling error. But modern sampling theory shows that optimum 
allocation of resources usually requires different proportions to be 
sampled from different segments, whether the purpose is to estimate 
average values over the population or to make analytical comparisons 
between results in one group of segments and those in another. 

When there are disparities in the relative sizes of segments in the 
sample as compared with the population, whether accidental or 
planned, these disparities must be taken into account when we at- 
tempt to estimate averages over the whole population. One way in 
which this can be done is by adjustments applied to the segments. Such 
adjustments proceed as follows. Suppose that we know 


(i) the true fraction of the population in each segment, and 
(ii) the segment into which each individual in the sample falls. 


Then we can weight each individual in the sample by the ratio 


    fraction of population in that segment 
    -------------------------------------- 
      fraction of sample in that segment 


(It is computationally convenient to weight each segment mean with 
the numerator of this ratio; the result is algebraically identical to that 
described above.) 
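
A minimal sketch of the two equivalent computations follows. The sample values, the segment labels "A" and "B", and the function names are hypothetical, chosen only to show that weighting individuals by the ratio gives the same adjusted mean as weighting segment means by the population fractions. It assumes that the population fractions sum to one and that every segment appears in the sample.

```python
def adjusted_mean_by_individuals(sample, population_fraction):
    """Weight each individual by (population fraction) / (sample fraction) of its segment."""
    n = len(sample)
    counts = {}
    for segment, _ in sample:
        counts[segment] = counts.get(segment, 0) + 1
    weighted_sum = 0.0
    weight_total = 0.0
    for segment, value in sample:
        weight = population_fraction[segment] / (counts[segment] / n)
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total

def adjusted_mean_by_segments(sample, population_fraction):
    """Weight each segment mean by the population fraction of that segment."""
    totals, counts = {}, {}
    for segment, value in sample:
        totals[segment] = totals.get(segment, 0.0) + value
        counts[segment] = counts.get(segment, 0) + 1
    return sum(population_fraction[s] * totals[s] / counts[s] for s in counts)

# Hypothetical data: segment A is over-represented in the sample.
sample = [("A", 2.0), ("A", 4.0), ("A", 3.0), ("B", 10.0)]
population_fraction = {"A": 0.5, "B": 0.5}

raw_mean = sum(value for _, value in sample) / len(sample)
by_individuals = adjusted_mean_by_individuals(sample, population_fraction)
by_segments = adjusted_mean_by_segments(sample, population_fraction)
print(raw_mean, by_individuals, by_segments)
```

Both adjusted figures agree with each other and differ from the raw mean, which is pulled toward the over-represented segment.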

The result of adjustment is a new “sampled population”—one such 
that the relative sizes of its various segments are very nearly correct 
(according to (i) above). Since the weight is the same for all the sample 
individuals in a given segment, adjustment does nothing to redress 
any selectivity which may be present within segments. If we adjust 
in this way, we remove one source of systematic error without affecting 
other sources at all. The philosophy of such adjustments is discussed 
further in Section G-12, and it is concluded that they may generally 
be appropriately made (within the limits discussed in Sections C-16 
to C-18). Their chief danger is the possible neglect of the possibilities 
that they may be 

(i) entirely too small, 
(ii) too large, or 
(iii) in the wrong direction, 

because of unredressed selectivity within the segments. When this 
possibility exists, extreme caution in presenting the results of adjustment 
is indicated. 


16. Probability samples 


When probability samples are used, inferences to the population 
can be based entirely on statistical principles rather than subject- 
matter judgment. Moreover, the reliability of the inferences can be 
judged quantitatively. A probability sample is one in which 


(i) each individual (or primary unit) in the sampled population has 
a known probability of entering the sample, 

(ii) the sample is chosen by a process involving one or more steps of 
automatic randomization consistent with these probabilities, 
and 

(iii) in the analysis of the sample, weights appropriate to the proba- 
bilities (i) are used. 


Contrary to some opinions, it is not necessary, and in fact usually not 
advisable in a pure probability sample for 


(i) all samples to be equally probable, or 
(ii) the appearance of one individual in the sample to be unrelated 
to the appearance of another. 


In practice, because some respondents cannot be found or are unco- 
operative, we usually obtain, at best, approximate probability samples 
(see Sections A-2 and D-13) and have approximate confidence in our 
inference. 


17. Nonprobability samples 


Samples which are not even approximately probability samples 
vary widely in both actual and apparent trustworthiness. Their trust- 
worthiness usually increases as they are insulated more and more 
thoroughly from selective factors which might be related to the quanti- 
ties being studied. Insulation may be obtained by: 


(i) adjustments applied to the segment means in the sample, 
(ii) examination of the sample as drawn for signs of selection on a 
particular factor, 
(iii) partial randomization. 

Adjustment for segments, as explained in Section 15 above, corrects 
for any selective factor operating between segments, but corrects not 
at all for selective factors operating within segments. If adjustment is 
to be used, deliberate selectivity between segments may be exercised 
without danger, so long as it does not imply selectivity within segments. 

Negative results when the sample is examined for signs of selection 
on a particular variable are comforting, and strengthen the reliability 
of the sample. The amount of this strengthening depends very much 
on the a priori importance of the variables checked to what is being 
studied. 

Deliberate (partial) randomization is a step toward a probability 
sample, and may be very helpful on occasion. 


18. Sampled population and target population 


We have found it helpful in our thinking to make a clear distinction 
between two population concepts. The target population is the popula- 
tion of interest, about which we wish to make inferences or draw 
conclusions. It is the population which we are trying to study. The 
sampled population requires a more careful definition but, speaking 
popularly, it is the population which we actually succeed in sampling. 

The notion of a sampled population can be more clearly described 
for probability sampling. In order to have probability sampling, we 
must know the chance that every sampling unit has of entering the 
sample, and the weight to be attached to the unit in the analysis. 
The sampled population may be defined as the population generated 
by repeated application of these chances and these weights. The fre- 
quency of occurrence of any particular sampling unit in the sampled 
population is proportional to the product 


(chance of entering the sample) × (weight used in analysis). 


This product is made constant for a probability sample. Thus, with 
probability sampling, the sampled population consists of all sampling 
units which have a non-zero chance of selection. 
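
The role of the constant product can be checked on a toy population. The sketch below is an assumption-laden illustration, not a description of any plan discussed in this report: each unit is supposed to enter the sample independently with its own known chance, the weight used in analysis is taken as the reciprocal of that chance, and the population values are invented. By enumerating every possible sample, it shows that the weighted total is correct on the average even though the chances are unequal.

```python
from itertools import product

def expected_weighted_total(values, chances):
    """Exact expectation of sum(y / chance) over all possible samples.

    Assumption: each unit enters the sample independently with its own
    known chance (a simple scheme chosen only for illustration).
    """
    expectation = 0.0
    for included in product([False, True], repeat=len(values)):
        prob = 1.0
        estimate = 0.0
        for y, chance, is_in in zip(values, chances, included):
            prob *= chance if is_in else (1.0 - chance)
            if is_in:
                estimate += y / chance  # weight used in analysis = 1 / chance
        expectation += prob * estimate
    return expectation

values = [3.0, 5.0, 8.0, 20.0]   # hypothetical population values
chances = [0.2, 0.5, 0.5, 0.9]   # known, unequal chances of entering the sample

# (chance of entering the sample) x (weight used in analysis), unit by unit.
products = [chance * (1.0 / chance) for chance in chances]
expected = expected_weighted_total(values, chances)
print(products)
print(expected, sum(values))
```

Every unit with a non-zero chance is represented equally in the sampled population, which is why the expected weighted total agrees with the true total up to rounding.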

The sampled population is an important concept because by statisti- 
cal theory we can make quantitative inferential statements, with known 
chances of error, from sample to sampled population. It must be 
carefully distinguished from the target population, the population of 
interest, about which we are tempted to make similar inferential state- 
ments. 

Even with probability sampling, the sampled and the target populations 
usually differ because of the presence of “refusals,” “not-at-homes,” 
“unable to classify,” and so on. The consequence of these 
disturbances is that certain sampling units, although assigned a known 
chance of selection by the sampling plan, did not in fact have this 
chance in practice. 

With non-probability sampling, the situation is much more obscure. 
By its definition as given above, the sampled population depends on 
the existence of a sampling plan (which may be only a vague set of 
principles in the investigator’s head) and on the “chances” that any 
sampling unit had of being drawn. These chances are not well known— 
if they were, we should have a probability sample. But in many cases, 
it is reasonable to behave as if these chances exist and to attempt to 
estimate them, because they provide the only means of making statis- 
tical inferences beyond the non-probability sample to a corresponding 
“sampled population.” The difficulty comes in specifying, or some- 
times even thinking about, the nature of the sampled population. It is 
certain to be a weighted population where, for example, Theodosius 
Linklater may appear 1.37 times, while Basil Svensson appears only 
0.17 times. 

Insofar as we make statistical inferences beyond the sample to a 
larger body of individuals, we make them to the sampled population. 
The step from sampled population to target population is based on 
subject-matter knowledge and skill, general information, and intuition 
—but not on statistical methodology. 


CHAPTER IV. THE INTERVIEW AREA 


19. Interview vs. questionnaire 

The committee members do not profess authoritative knowledge 
of interviewing techniques. Nevertheless, the method by which the 
data were obtained cannot be regarded as outside the scope of the 
statistical aspects of the research. 

For what our opinion is worth, we agree with KPM that a written 
questionnaire could not have replaced the interview for the broad 
population contemplated in this study. The questionnaire would not 
allow flexibility which seems to us necessary in the use of language, in 
varying the order of questions, in assisting the respondent, in following 
up particular topics and in dealing with persons of varying degrees of 
literacy. This is not to imply that the anonymous questionnaire is 
inherently less accurate than the interview, or that it could not be 
used fruitfully with certain groups of respondents and certain topics. 
So far as we are aware, not enough information is available to reach a 
verdict on these points. 

20. Interviewing technique 


Many investigators have faced the problem of attempting to obtain 
accurate information about facts which the respondent is thought to 
be unwilling to report. It is natural to inquire whether KPM, in their 
interviewing technique, took advantage of accumulated experience 
as to the best methods for extracting the facts. But it is also well to 
inquire how much definite experience has been accumulated. 

The KPM interview impressed us as an extraordinarily skillful per- 
formance. Direct questions are put rapidly in an order which seems to 
these respondents hard to predict, so that it is difficult to tell what is 
coming next. Despite the air of briskness, we did not receive the im- 
pression that we were being hurried if we wished to reflect before re- 
plying, and supplementary questions or information were given if this 
seemed helpful to the memory. The coded recording of the data was 
done unobtrusively by the interviewer, so that the interview appeared 
to be a friendly conversation rather than any kind of an inquisition. 
These, of course, are personal impressions. 

KPM evidently think highly of the virtues of this technique, because 
it was adopted despite limitations which it imposes on the scope and 
rate of progress of the study. The technique makes great demands on 
the interviewer. The long period of training and the personal qualities 
required have restricted and will continue to restrict the interviewers 
to a very small number. This limits the speed with which data can be 
accumulated and also puts restrictions on the type of sampling that 
can be employed. 

The type of interview used by KPM differs markedly from the 
less directive methods which are sometimes recommended for dealing 
with taboo subjects. If the subject is likely to feel that his answer to a 
certain question will affect his prestige in the eyes of the interviewer, 
a less directive approach would be to conduct the interview in such a 
way that he gives the desired information without realizing that he is 
answering the awkward question. The KPM method is the antithesis 
of this. Research on interviewing techniques has not yet produced any 
substantial body of evidence as to the superiority of either the less 
directive methods or the KPM technique. 

With regard to specific inaccuracies in the KPM data, we believe that 
the interview gives an opportunity both for positive and negative bias. 
The KPM assumption that everyone has engaged in all types of activity 
seems to some likely to encourage exaggeration by the respondents. 
(KPM feel (personal communication) that their cross-checks are 
highly effective in detecting such exaggeration.) On the other hand, 
our impression from the interview was that a successful denial of cer- 
tain types of activity would be possible if the subject was prepared to 
do so, although we do not know the full extent of the KPM cross- 
checks which would lead them to be suspicious of such a denial. KPM 
assert (personal communication) that they regard cover-up as a more 
likely source of bias than exaggeration. Our opinions on this statement 
are divided. 

As KPM point out (p. 48), the subject’s willingness to talk about 
certain types of activity is influenced by the attitudes of the social 
group to which he belongs. Until evidence to the contrary is presented, 
the presumption (made by some of the critics) that his final responses 
will also be influenced is one that cannot be cast aside. The size of these 
influences is still a matter of opinion. A corresponding element of doubt 
is present in almost all comparisons between different social levels, 
both those which provide some of the most interesting comparisons in 
the book, and those in many other studies. 


CHAPTER V. THE SAMPLING AREA 
21. KPM’s sampled population 


As noted above, KPM’s sample was deliberately disproportionate, 
partly in order to cover individual segments defined by age, education, 
religion, etc., in an adequate manner, partly because of geographical 
convenience. If the results for individual segments were to be based 
on samples of at least moderate size, such disproportion was necessary 
and wise. Its effects on overall results are less clear. It seems impossible 
to be sure what effect it had on the variability of the final result, and 
its use is certainly not a demonstrable error as far as variability is 
concerned. 

In their U. S. corrections, KPM provided adjustments for dispro- 
portion between segments defined by age, education, and marital status. 
As noted above (Section 17) we feel that such adjustments are usually 
appropriate. Due to absence of population data, they did not adjust 
for religion. The geographical imbalance of their sample was so great 
that an overall geographic adjustment was not feasible. Thus they com- 
pensated for some disproportions, and left others to produce what 
effects they would. 

Their only examination of the sample for signs of selection within 
segments is their comparison of 100% groups (groups where all members 
were interviewed) with partial groups (groups where only part of 
the members were sampled). This gives some insight into the effect of 
volunteering as a selective factor. Beyond this, KPM report no serious 
effort to measure the actual effect of volunteering, or to discover what 
percentage of the population they would be able to persuade to be inter- 
viewed. 

They made no use of randomization. They might have attempted to 
sample, say, college seniors from two colleges drawn at random from a 
large list of colleges, but they are of the opinion (personal communica- 
tion) that this would have slowed up the work to an unmanageable 
extent. 

All in all, the absence of any orderly sampling plan contrasts strik- 
ingly with their usual methodical mode of attack on other problems. 

As stated briefly above (Section 6), the “sampled populations” 
corresponding to 


(1) KPM’s raw means, and to 
(2) KPM’s “U.S. corrected” means, 


respectively, are startlingly different from the composition of the U. S. 
white male population. (For example, although these sampled popula- 
tions have the U. S. average combination of education and rural- 
urban background, they have half of their members living in Indiana.) 
Since a complete probability sample seems to have been out of the 
question at the beginning of the KPM investigation, some such “sam- 
pled population” was to be expected, although it might have been some- 
what less distorted. Provided that further statistical analyses of the 
sort indicated in Appendix C, Chapter II-C were made, it would be 
possible to make adequate rigorous inferences from the sample to 
this ill-defined “sampled population.” 

The inference from these vague entities to the U. S. white male 
population depends on: 


(a) the inferrer’s view as to what these “sampled populations” are 
really like, and 

(b) the inferrer’s judgment as to how (reported) sexual behavior 
varies within segments. 


It is not surprising that experts disagree. 

The inference from KPM’s sample to the (reported) behavior of all 
U. S. white males contains a large gap which can be spanned only by 
expert judgment. This is a common phenomenon in social fields, but 
is still unfortunate. A considerable bridge across this gap would be 
furnished by a small probability sample. 

22. Could KPM have used probability sampling? 


If probability sampling could have been used, its use would have 
avoided one of the main gaps in KPM’s present chain of inference. 
We have, therefore, considered this possibility carefully. 

The difficulties in applying probability sampling to KPM’s study 
lie in the expenditure of time required to make the contacts necessary 
to persuade a predesignated man to give a history. By adapting the 
mechanism of the probability sample to KPM’s situation, these dif- 
ficulties may perhaps be reduced (see Appendix D, Chapter V-D). 
It would almost certainly have been impractical for KPM to have used 
a probability sample in the early years of their study. If KPM’s apparent 
“opinions” (p. 39 of KPM) as to the effectiveness of their present 
techniques of contact are correct, starting a probability sample 
would have been practical at any time since the appearance of the 
male volume in 1948.¹ However, KPM (personal communication, 1952) 
feel that such an interpretation of their written statement is unwarranted. 

Since it would not have been feasible for KPM to take a large sample 
on a probability basis, a reasonable probability sample would be, 
and would have been, a small one, and its purpose would be: 


(1) to act as a check on the large sample, and 
(2) possibly, to serve as a basis for adjusting the results of the large 
sample. 


A probability sampling program planned to serve these purposes is 
discussed in Appendix D, Chapter VII-D. Such a program should 
proceed by stages because of the absence of information on costs and 
refusal rates. 

This conclusion about probability sampling does not excuse KPM 
from the responsibility for choosing geographical disproportion in 
order to save travel time and expense. The wisdom or unwisdom of this 
choice seems to depend on one’s view as to the magnitude of geographi- 
cal differences. Again, it is not surprising that experts disagree. 


CHAPTER VI. METHODOLOGICAL CHECKS 


23. Possible checks 


The primary check, if it could be made, is the comparison of average 
actual behavior with average reported behavior. Variability in the 
difference between actual and reported behavior is secondary in interest, 
because high variability merely implies the necessity of larger numbers 
of cases, while large average differences between actual and reported 
behavior represent a systematic error that cannot be adjusted without 
rather complete knowledge. Unfortunately this primary check does not 
at present seem feasible in studying human sexual behavior as it occurs 
in our culture. 

¹ “The number of persons who can provide introductions has continually spread until now, in the 
present study, we have a network of connections that could put us into almost any group with which 
we wished to work, anywhere in the country.” (P. 39 of KPM.) 

Of secondary importance are checks of the single actual report with 
the average actual report, where averages may be taken over fluctuations, 
time, spouses, and/or interviewers. (See Appendix A, Chapter 
V-A.) In this second category, the following possible comparisons 
suggest themselves: 

1. Reinterviews of the same respondent 

2. Comparison of spouses 

3. Comparison of interviewers on the same population segment 

4. Duplicate interviews by the same interviewer at various times. 


24. KPM’s checks 


The only comparison of observed and reported behavior which KPM 
found feasible was the date of appearance of pubic hair, which agreed 
quite successfully. This is a physical characteristic, different in 
character and emotional loading from the behavior of main interest. Some 
subjects may have had to rely upon general information, plus some 
assistance from the interviewer, in naming a date for themselves. 
Thus this check furnishes rather weak support. 

At the level of rechecks on respondents, some information is avail- 
able but more is needed. Similarly, comparisons of spouses have been 
made for a relatively selected group. The checks themselves are en- 
couraging, but more cases are needed. 

Some attempts have been made to compare the staff interviewers 
but since there is some selection in the assignment of cases, these 
comparisons do not meet the problem as squarely as interviews of the 
same respondent by different interviewers, or the recorded interview 
technique. 

A comparison of early versus late interviews by Kinsey is given in 
KPM, but it is hard to tell, for example, whether the 12.4% drop 
(from 44.9% to 32.5%) in the accumulative incidence for total 
premarital intercourse at age 19 (single males, education level 13+) from 
early to late interviews is due to differing groups sampled, instability 
in the interviewing process, or reasonable sampling variation for cluster 
sampling (KPM p. 146). 

KPM have made serious efforts to check their work in the aspects 
where checking seems feasible. However, improved and more extensive 
checking is needed. Although duplicate recording of interviews is men- 
tioned, no data have been published. Even if they must be based on 
very few cases, such comparisons should be made available. 


CHAPTER VII. ANALYTICAL TECHNIQUES 
25. Variables affecting sexual behavior 


After introductory chapters (5 and 6) on early sexual growth and 
activity, KPM proceed to examine the effects of the following vari- 
ables: 

Age 

Marital status 

Age of adolescence 

Social level 

Comparison of two generations 

Vertical mobility in the occupational scale 
Rural-urban background 

Religious background 


In this chapter we attempt to appraise, in general terms, the analytical 
techniques used by KPM in their study of these variables. 


26. Definition of the variables 


Some of the variables: age of adolescence, social level, occupational 
level, rural-urban background and religious background, involve prob- 
lems of definition. These seem to have been in the main thoughtfully 
handled and presented by KPM. For instance, KPM discuss the rela- 
tive merits of educational level attained by the subject and of the oc- 
cupational class of the subject and of his parents as a measure of social 
level (pp. 330-32). In their opinion, educational level is the most satis- 
factory criterion and this was adopted for the analysis. In the case of 
religious affiliation, KPM distinguish between active and inactive pro- 
fession of religious faith, though the definition of the two terms is not 
made entirely clear. 

The definition which looks least satisfactory is that of age of adoles- 
cence (p. 299), where the problem is formidable. The criteria employed 
by KPM appear difficult for the reader to interpret. 


27. Assessing effects of variables 


With a multiplicity of variables which may interact on each other, 
the task of assessing the importance of each variable individually is 
not easy. Examination of the variables one by one, ignoring all other 
variables except the one under scrutiny, may give wrong conclusions, 
because what appears on the surface to be the effect of one variable 
may be merely a reflection of the effects of other variables. 
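
A small numerical sketch may make the danger concrete. The cell counts and cell means below are invented for illustration and are not taken from KPM's tables; the variable names and levels are likewise hypothetical. Within each age group the two educational levels have identical means, yet a one-variable comparison shows an apparent effect of education, because education and age are unevenly mixed in the sample.

```python
# Hypothetical cells: (age group, educational level) -> (number of cases, mean frequency).
cells = {
    ("young", "low"):  (10, 4.0),
    ("young", "high"): (40, 4.0),
    ("old", "low"):    (40, 2.0),
    ("old", "high"):   (10, 2.0),
}

def marginal_mean(cells, position, level):
    """Mean over all cells whose key has `level` at `position`, ignoring the other variable."""
    total = sum(n * mean for key, (n, mean) in cells.items() if key[position] == level)
    count = sum(n for key, (n, _) in cells.items() if key[position] == level)
    return total / count

# One-variable view: education examined alone, age ignored.
low = marginal_mean(cells, 1, "low")
high = marginal_mean(cells, 1, "high")

# Education effect with age held fixed.
within_age = {
    age: cells[(age, "high")][1] - cells[(age, "low")][1]
    for age in ("young", "old")
}

print(low, high, within_age)
```

The apparent difference between the "low" and "high" means is wholly a reflection of age; holding age fixed removes it.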

A thorough attack on this problem calls for a multiple-variable ap- 
proach in which all effects are investigated simultaneously. This re- 
quires a high degree of statistical maturity and of skill in presentation. 

The method utilized by KPM is a compromise. In general, with some 
exceptions, they regard age, marital status and educational level as 
basic variables, which are held fixed or compensated for in the 
investigation of each of the remaining variables. The other variables are 
disregarded for the moment. Although we have not examined the matter 
exhaustively, this policy seems to have been justified by events, 
because KPM claim from their analyses that the other variables, with 
the exception of age at adolescence, have had relatively minor effects. 


28. The measurement of activity 


In the KPM tables, activity is measured by “incidence” (per cent of 
the population who engage in the activity) as well as by frequency per 
week. In some tables, both mean and median frequencies are given, 
and also frequencies for the total and for the active population. There 
are advantages in presenting various measures. On the other hand, 
inspection suggests that all these measures are correlated: that is, 
to some extent they tell the same story. A complex internal analysis 
would probably show about how many measures are really needed to 
extract the information in the data and what individual measurements, 
or combinations of them, are best for this purpose. Perhaps a single 
one, or at most two, would suffice. As it is, both KPM and the indus- 
trious reader have to wade through tables and discussion of a number 
of different measurements, without being clear whether anything new 
is learned. Simplification would be pleasant, but is far from essential. 


29. Tests of significance 


In the discussion of effects which they regard as real, KPM make 
little appeal to tests of significance. They often present standard errors 
attached to the mean frequencies for individual cells. Because sampling 
was non-random and was by groups, these standard errors, calculated 
on the assumption of randomness, are under-estimates, perhaps by a 
substantial amount. The standard errors have a kind of negative vir- 
tue, in the sense that if a difference is not significant when judged 
against these errors, it would not be significant if a valid test could be 
devised. The problem of devising a realistic estimate of the true 
standard errors is one of considerable complexity (see Section II-C). 

We have been unable to discover from the book the principles by 
which KPM decide when to regard an effect as real. The size of the 
effect is one criterion. Size should certainly be taken into account, 
since an effect may be significant statistically but too small to be of 
biological or sociological interest. They evidently attach some impor- 
tance to the consistency with which an effect is exhibited in different 
parts of a table. As a criterion, consistency is of variable worth. Con- 
sistency over different age groups (where age denotes age at the time 
of the reported activity) is of little worth, since there is inevitably 
substantial correlation between sampling fluctuations of reported 
activities at neighboring ages because the same subject appears in 
neighboring age groups. More weight can be attached to consistency 
over different educational levels, because different groups of subjects 
are involved. 

To summarize, statements about the data in their tables lie at the 
level of shrewd descriptive comment, rather than at the level of an 
attempt to make inferential statements from a sample to a clearly de- 
fined population (even though this could not be the U. S. white male 
population). 

We do not propose to discuss the analysis for each variable sepa- 
rately. Two analyses which have attracted much attention will be con- 
sidered later (Sections 33 to 37). 


30. U. S. Corrections 


In most sampling plans it is necessary to provide a set of weights 
for the segments of the sampled population to recover accurate esti- 
mates for the target population (i.e. the population about which 
inferences are desired). That such adjustments are usually appro- 
priate, whether probability or nonprobability samples are employed, 
has already been pointed out (Section 17, see Section II-G). 

Since KPM have as their target population U.S. white males, we can 
reasonably expect them to apply weights in an attempt to correct 
for disproportionate representation in the sampled population of some 
segments of the target population. 

KPM supply U. S. Corrections (pp. 106-9) and use them rather
consistently throughout the work. There are no examples given explaining
the application of the weights. The critics, and sometimes this
committee, have had difficulty in verifying computations where they have
been used. Of the 13 tables where corrections could be checked
completely, one checked, 10 checked except for one age group each, and two
were not checked by the correction mentioned in the text. Apparently
the exposition could be improved.

The U. S. Corrections should be used, but it might be possible to
make a more effective choice of segments (see A-43 and V-C and II-G).

KPM did not sufficiently warn the reader that U.S. corrected figures 
are not corrected for selection within segments, and may be seriously 
biased. 
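
The weighting idea behind such corrections can be illustrated with a small sketch. It is not KPM's computation: the segment names, sample sizes, segment means and population shares below are all hypothetical. A corrected figure is formed as a weighted average of segment means, with weights equal to the assumed shares of the segments in the target population.

```python
# Hypothetical illustration of a weighted ("corrected") estimate.
# All segments and figures are invented; none are taken from KPM.

def raw_mean(segment_means, sample_sizes):
    """Uncorrected mean: each segment counts in proportion to its sample size."""
    n = sum(sample_sizes.values())
    return sum(segment_means[s] * sample_sizes[s] / n for s in segment_means)

def corrected_mean(segment_means, population_shares):
    """Weighted mean of segment means; weights are target-population shares."""
    total = sum(population_shares.values())
    return sum(segment_means[s] * population_shares[s] / total for s in segment_means)

segment_means = {"grade school": 3.0, "high school": 2.5, "college": 2.0}
sample_sizes = {"grade school": 100, "high school": 200, "college": 700}
population_shares = {"grade school": 0.40, "high school": 0.45, "college": 0.15}

print("raw:      ", round(raw_mean(segment_means, sample_sizes), 3))
print("corrected:", round(corrected_mean(segment_means, population_shares), 3))
```

Such a correction removes only the disproportion between segments. It does nothing about selection within a segment, which is the limitation noted above.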


31. The accumulative incidence curve 


KPM have a useful device for summarizing incidence data by age. 
This accumulative incidence curve gives the percentage of individuals 
in the sample (reporting for a given age) to whom a particular event 
has occurred before that age. Although the explanation of the concept 
of accumulative incidence is not as clear as most of KPM’s writing, the 
computations made are satisfactory. When there are no generation- 
to-generation changes in the population and no differential recall 
depending on age at report, this method is particularly justified, be- 
cause it packs all the incidence data neatly into one grand summary. 
(For discussion of the critics’ comments see A-39.) No better method 
for overall comparisons seems to be available. 
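
A numerical sketch may help fix the idea. The records below are invented, and the rule used for "reporting for a given age" (age at interview at least equal to that age) is our assumption for the illustration, not a statement of KPM's exact procedure.

```python
# Hypothetical sketch of an accumulative incidence computation.
# Each record: (age at interview, age at first occurrence, or None if never).

def accumulative_incidence(records, age):
    """Percent of subjects able to report for `age` whose first occurrence
    fell before that age; None if no subject can report for that age."""
    eligible = [(a_int, a_first) for a_int, a_first in records if a_int >= age]
    if not eligible:
        return None
    experienced = sum(
        1 for _, a_first in eligible if a_first is not None and a_first < age
    )
    return 100.0 * experienced / len(eligible)

records = [
    (18, 15), (18, None), (25, 17), (25, 22),
    (40, 16), (40, None), (40, 30), (16, 14),
]

for age in (16, 20, 25, 35):
    print(age, accumulative_incidence(records, age))
```

The number of subjects able to report shrinks as age increases, and younger and older subjects are pooled at the lower ages. This is why the curve is justified only when there are no generation-to-generation changes and no differential recall.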


32. Other devices 


1. KPM did some extensive sampling experiments on their data, 
with a view to discovering the sample size needed for the accuracy 
they desired. These experiments turned out to be almost valueless 
because KPM did not take account of the necessary statistical princi- 
ples (see A-19). 

2. The committee had an opportunity to inspect the KPM facilities 
on a visit to Bloomington, Indiana. We observed that the data sheets 
were neatly filled out, that the files were well kept, that requests for 
original data were usually met in a matter of moments, and that the 
office was well equipped for handling the extensive data with which 
KPM deal. 

3. The KPM volume was written while data were still being col- 
lected. Apparently KPM chose to use all the data on hand at the time 
a particular point was being analyzed (personal communication from 
KPM). Thus different tables have different totals, a source of annoy- 
ance to critics and users of the book. The reasons for this should have 
been pointed out by KPM. The additional interviewing was deliber- 
ately selective with an aim to strengthen weak segments (personal 
communication from KPM). It seems to us that, if this strengthening
was necessary for later analyses, it would have been worthwhile to add 
the new material to the early tabulations. This would also have in- 
creased comparability and avoided the problems raised by the exist- 
ence of many different sampled populations. 


CHAPTER VIII. TWO COMPLEX ANALYSES 


33. Patterns in successive generations 


In this chapter we discuss briefly two analyses by KPM which have 
attracted much attention. Our object is to give two specific illustra- 
tions of the kind of analysis which they chose to undertake, with 
comments on their competence. 

The first analysis was made by dividing the sample into two groups: 
those over 33 years of age at the time of interview, with a median age 
of 43.1 years, and those under 33 years at the time of interview, with a 
median age of 21.2 years. 

Our comments deal with three topics: (i) the statistical methodology
employed, (ii) KPM’s summary of their tables, and (iii) the general problem
of inference from data of this type.


34. Statistical methods 


In the comparisons, educational level and age at the time of the 
activity are held constant and in nearly all comparisons marital status 
also. The method used to compare the group means seems satisfactory 
except for some minor points, discussed in A-25, A-33 and A-43. 

It would have been helpful to present classifications of the older and 
younger groups according to other factors which might influence sexual 
activity, e.g., rural-urban background, religious affiliation, marital 
status at age 20 or 25. The two groups would not necessarily agree 
closely in these break-downs, for there has been a slow drift towards 
the towns, and perhaps a drift towards “inactive” rather than “active” 
religious affiliation. For interpretive purposes it is advisable, in any 
event, to learn as much as possible about the compositions of the older 
and younger groups. Some critics have claimed that the older genera- 
tion is “atypical.” 


35. KPM’s summary of their tables 


The data are presented in 8 large tables (98-105). As a statistician 
learns from experience, a competent summary of a large body of data 
is not an easy task. KPM give a detailed discussion of the accumulative 
incidence data for each type of outlet, followed by a similar discussion
of the frequency data. 

These detailed comments on what the data appear to show seem 
sound, except that on two occasions where the younger group showed 
greater sexual activity, KPM ignored or played down the difference 
between the two groups (Section A-45). 

Their general summary statement reads in part as follows: 


“The changes that have occurred in 22 years, as measured by the data 
given in the present chapter, concern attitudes and minor details of be- 
havior, and nothing that is deeply fundamental in overt activity. There has 
been nothing as fundamental as the substitution of one type of outlet for 
another, of masturbation for heterosexual coitus, of coitus for the homo- 
sexual, or vice versa. There has not even been a material increase or decrease 
in the incidences and frequences of most types of activity. ... 

“And the sum total of the measurable effects on American sexual be- 
havior are slight changes in attitudes, some increase in the frequency of 
masturbation among boys of the lower educational levels, more frequent 
nocturnal emissions, increased frequencies of premarital petting, earlier 
coitus for a portion of the male population, and the transferences of a per- 
centage of the pre-marital intercourse from prostitutes to girls who are not 
prostitutes.” 


Some critics have objected strongly to this statement, particularly 
the first paragraph, on the grounds that it gives a biased report by
brushing aside the differences in activity, which are almost all in the
direction of higher or earlier sexual activity by the younger group. 
The reporting does appear a little one-sided, in that the reader is en- 
couraged to conclude that the differences are immaterial, although 
KPM do not state what they mean by a “material” increase. On the 
other hand, the catalogue of differences, given at the end of the second 
paragraph above, includes all differences noted either by KPM or 
the critics, except for an increased homosexual activity in the younger 
group at educational levels 0-8 and 9-12. 


36. Validity of inferences 


Two objections have been made by some critics to any inferences 
drawn from a comparison of this type. The first is that the groups may 
not be representative of their generations. KPM have attempted to 
dispose of this objection, at least in part, by holding educational level 
and marital status constant. It might be possible to go further and hold 
other factors constant, or at least examine whether the samples from 
the two generations differ in these factors. But with non-random 
sampling the objection is not removed even if a number of factors are 





IR 1983 


Ission 


seem 
Owed 
rence 


. data 
of be- 
e has 
et for 
omo- 
ease 


1 be- 
cy of 
juent 
arlier 
| per- 
2 not 


arly 
; by 
the 


ugh 
the 
ond 
or 
ger 


STATISTICAL PROBLEMS OF THE KINSEY REPORT 705 


held constant, because one or both groups might be biased with respect 
to some factor whose importance was not realized. Various opinions 
may be formed as to the strength of the objection, but it can be re- 
moved only by the use of probability sampling accompanied by valid 
tests of significance. 

Secondly, in a comparison of this type, the older generation is describ- 
ing events which involve a much longer period of recall, with a possi- 
bility of distortion as events become distant. Further retake studies, 
if KPM can continue them for a sufficiently long period, may throw 
some light on the strength of this objection. 

The joint effect of these objections is to render the conclusions tenta- 
tive rather than definitely established. 


37. Vertical mobility 


This analysis (pp. 417-47) shows a degree of ingenuity and sophisti- 
cation which is not too common in quantitative investigations in soci- 
ology. The data are arranged in a two-way array according to the oc- 
cupational class of the subject at the time of interview and the occupa- 
tional class of the parents. KPM examine whether the pattern of sexual 
activity of the subject is more strongly associated with the parental 
occupational class than with that attained by the subject. They con- 
clude (p. 419) 

In general, it will be seen that the sexual history of the individual accords 
with the pattern of the social group into which he ultimately moves, rather 
than with the pattern of the social group to which the parent belongs and 
in which the subject was placed when he lived in the parental home. 

The most significant thing shown by these calculations (Tables 107-115)
is the evidence that an individual who is ever going to depart from the
parental pattern is likely to have done so by the time he has become
adolescent.


The amount of data which KPM present in this analysis is worth 
mention as evidence that they do not shirk work. Tables are given for 
7 types of activity. Three age groups are shown in each table. When 
we classify by occupational level of subject and parent, this leads to 
21 two-way tables. Five measures of the type of activity are given, so 
that a painstaking examination extends over 105 two-way tables. 

KPM appear to have paid most attention to the frequency data. 
Their task is to determine whether this shows a stronger association 
with the occupational class of the subject or of the parent. In reaching 
a verdict, they rely on judgment from eye inspection. By a similar eye 
inspection, we agree with their verdict as a descriptive statement of 
what the data indicate, although different individuals might disagree
as to how definitely their statement holds. Judgments made by one 
individual for the data on frequencies were that in 7 of the 21 two-way 
tables, association with subject and parent either was not present at 
all or looked about equal. In 9 it looked mildly more with the subject 
and in 5 it looked strongly more with the subject. 

It would be of interest to undertake a more objective analysis. Analy- 
sis of variance techniques are available for this purpose, although some 
theoretical problems remain. 
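
One simple form that such an analysis could take is sketched below. A two-way table of cell means is split into a part associated with the subject's class (rows), a part associated with the parental class (columns), and a residual. The table is invented, every cell is given equal weight regardless of the number of cases behind it, and the sketch is offered only as an illustration, not as an analysis carried out by the committee.

```python
# Invented 3x3 table of mean frequencies.
# Rows: occupational class of subject; columns: occupational class of parent.
table = [
    [3.0, 3.2, 3.1],
    [2.4, 2.5, 2.6],
    [1.8, 2.0, 1.9],
]

def two_way_sums_of_squares(t):
    """Unweighted two-way decomposition: (rows, columns, residual)."""
    r, c = len(t), len(t[0])
    grand = sum(sum(row) for row in t) / (r * c)
    row_means = [sum(row) / c for row in t]
    col_means = [sum(t[i][j] for i in range(r)) / r for j in range(c)]
    ss_rows = c * sum((m - grand) ** 2 for m in row_means)
    ss_cols = r * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((t[i][j] - grand) ** 2 for i in range(r) for j in range(c))
    return ss_rows, ss_cols, ss_total - ss_rows - ss_cols

ss_rows, ss_cols, ss_resid = two_way_sums_of_squares(table)
print("subject class:", round(ss_rows, 4))
print("parent class: ", round(ss_cols, 4))
print("residual:     ", round(ss_resid, 4))
```

In this invented table the sum of squares for subject class is much the larger, which corresponds to a judgment of "strongly more with the subject." A fuller treatment would have to allow for unequal cell sizes and for the dependence introduced by the sampling.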

So far as interpretation is concerned, the principal disturbing factor 
is the possibility, which some critics have mentioned, that the subject’s 
reports of his activity are influenced by the social level to which he 
belongs at the time of interview. KPM maintain that attitudes towards 
different types of activity are strongly affected by the social level of the 
subject. Whether they change when he changes his social level would 
be interesting to discover. Something might be learned by retakes for 
subjects who had moved in the social scale. To obtain an abundant body 
of data of this kind will, however, be a slow and difficult process. 


CHAPTER IX. CARE IN INTERPRETATION 
38. Sample and sampled population 


In sample surveys, the inference from sample to sampled population 
is often relatively straightforward, although not trivial. We can usually 
set limits so that the statement “the sample agrees with the sampled 
population within these limits” has approximately the agreed-upon 
risk. (We may have to work fairly hard to set these limits correctly.) 
But we have always to remember, and usually must remind the reader 
steadily, that these limits are not infinitely narrow. 
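
For the simplest case, limits of this kind can be sketched as follows. The sketch assumes simple random sampling and uses the usual normal approximation for a proportion; the counts are invented. Because KPM's sample was not a simple random sample, limits computed this way would be too narrow for their data and are shown only to fix ideas.

```python
import math
from statistics import NormalDist

def proportion_limits(successes, n, risk=0.05):
    """Normal-approximation limits for a proportion under simple random
    sampling; `risk` is the agreed-upon two-sided risk of missing."""
    p = successes / n
    z = NormalDist().inv_cdf(1 - risk / 2)
    half_width = z * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)

# Invented example: 120 of 400 subjects report a given activity.
low, high = proportion_limits(120, 400)
print(round(low, 3), round(high, 3))
```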

KPM’s caution on page 153 (quoted in Appendix A, Section 48) is 
a caution, but it is not repeated. 

In general, their statements about small differences are more forth- 
right than we would care to make. 


39. Sampled population and target population 


When a respectable approximation of a probability sample is in- 
volved, the step from sampled population to target population is usual- 
ly short and the inference strong. Otherwise, the inference is often 
tortuous and weak. It depends on subject matter knowledge and intui- 
tion, and on other barely tangible considerations. These considerations 
deserve to be brought to the reader’s attention, and to be discussed
as best the authors may.

This KPM did not do adequately. Their discussion of diversification
(p. 92) and 100 per cent samples (p. 93) is only a beginning.


40. Systematic errors of measurement 


Any quantitative study offers the possibility of systematic errors 
of measurement. It is generally agreed that these possibilities should be 
placed before the reader and discussed. 

In KPM’s study these possibilities concentrate on the difference 
between present reported and past actual behavior. KPM spent 
Chapter 4 on this question. Their discussion is generally good, except 
on some questions which arise in connection with generation-to- 
generation comparison (see Sections A-25 and A-44). 


41. Unsupported assertions 


We are convinced that unsubstantiated assertions are not, in them- 
selves, inappropriate in a scientific study. In any complex field, where 
many questions remain unresolved, the accumulated insight of an ex- 
perienced worker frequently merits recording when no documentation 
can be given. However, the author who values his reputation for ob- 
jectivity will take pains to warn the reader, frequently repetitiously, 
whenever an unsubstantiated conclusion is being presented, and will 
choose his words with the greatest care. KPM did not do this. 

Many of the most interesting statements in the book are not based 
on the tabular material presented and it is not made at all clear on what 
evidence the statements are based. Nevertheless, the statements are 
presented as if they were well-established conclusions. 


42. Some major controversial findings 


Some KPM findings about which much scientific discussion has cen- 
tered relate to: 


(i) stability of sexual patterns, 
(ii) homosexuality, and 
(iii) the effects of vertical mobility. 


In all these areas KPM have made forthright and bold statements. 
As discussed in more detail in Sections A-45 to A-47 (also see A-25), 
there are reasons for caution in every one of the three areas. 


CHAPTER X. COMPARISON WITH OTHER STUDIES*
43. Interviewing 


Good sex studies have been made using both the personal interview 
and questionnaire techniques. Given that just one technique is to be 
employed, KPM’s choice of personal interview seems necessary if 
illiterates or near-illiterates are to be sampled. At present, it is good 
practice in gathering this type of data to endeavor to have all subjects 
give information on as many relevant points of the study as possible. 
No study seems to have done better on this matter than KPM. 

Whether it is always good practice to standardize the questions 
asked is debatable. KPM did not do this and give telling arguments 
against the practice. Some other studies have standardized the ques- 
tions, both in personal interview and in self-administered question- 
naires, and they have included good arguments in favor of their pro- 
cedure. In training interviewers KPM seem to have gone to greater 
lengths (a year of training) in preparing for the specific interview used 
in the study, than any of the other personal interview studies. Informa- 
tion on training of interviewers is fairly hard to come by in all these 
studies. 

Given the choice of personal interview, it is not possible at this writ- 
ing to be logically certain whether the KPM technique is better or 
worse than that of the other interview studies, no matter whether one 
approves or disapproves of the tactics of a diagnostician or medical 
detective. Some discussion of how the KPM interview appeared to 
us is given in Appendix E. Numerous cross-checks on frequency and 
dates of occurrences appear within the KPM interview, while they 
seem to be lacking in most other studies. Setting aside points on which 
there is no evidence, KPM’s interviewing is as good as or better than 
that of the other studies reviewed. 





* The material in this chapter is our inference from the reviews supplied by W. O. Jenkins and
presented in Appendix B. We have not personally read all the volumes concerned. The volumes are as
follows:

Bromley, Dorothy D., and Britten, Florence H. Youth and sex. New York: Harper and Brothers, 1938.
Davis, Katherine B. Factors in the sex life of twenty-two hundred women. New York: Harper and Brothers, 1929.
Dickinson, R. L., and Beam, Lura A. The single woman. Baltimore: Williams and Wilkins Co., 1934.
Dickinson, R. L., and Beam, Lura A. A thousand marriages. Baltimore: Williams and Wilkins Co., 1931.
Farris, E. J. Human fertility and problems of the male. White Plains, N.Y.: Author’s press, 1950.
Hamilton, G. V. A research in marriage. New York: A. and C. Boni, 1929.
Kinsey, A. C., Pomeroy, W. B., and Martin, C. E. Sexual behavior in the human male. Philadelphia: W. B. Saunders Company, 1948.
Landis, C., et al. Sex in development. New York and London: Paul B. Hoeber, 1940.
Landis, C., and Bolles, M. M. Personality and sexuality of the physically handicapped woman. New York and London: Paul B. Hoeber, 1942.
Terman, L. M., et al. Psychological factors in marital happiness. New York: McGraw-Hill Book Co., 1938.


44. Checks


As for checks on the interviewing process, KPM unquestionably 
lead the field with 100 per cent samples, retakes, spouse comparisons, 
early vs. late groups, interviewer comparisons, and the pubic hair study. 
Some authors mention casual checks with no data supplied. Bromley 
and Britten compare interview and questionnaire results on different 
groups. Davis reports a study where 50 subjects were interviewed 
before and after questionnaire administration, and offers a breakdown 
by consecutive 100 questionnaires received. Dickinson and Beam’s
two books speak of comparing verbal reports and physical examination 
results as a way of verifying the record rather than as a check—no 
records seem to be published. Farris’ comparison of reported vs. per- 
sonally recorded masturbatory rates omits the critical comparative 
information. Hamilton finds that different question wordings give 
different responses, but leaves the matter here. Landis and Bolles use 
several independent judges for evaluation of scales—but, instead of 
comparing their results, argue that agreement will be good because of 
experience and training. They do not compare normal with handicapped 
subjects. Landis checks with the psychiatric case history as a means 
of eliminating subjects with discrepancies, and gives data on the agree- 
ment of independent judges’ ratings. Terman offers spouse comparisons. 
When KPM’s checks are viewed with those of the other leading sex
studies in mind, it is clear that a new high level has been established.


45. Sampling 


All studies used volunteer non-probability samples. Some were drawn 
from more specifiable target populations than others. For example, 
Bromley and Britten drew exclusively from college volunteers, while 
Davis used mail-questionnaire respondents from lists of Women’s 
Clubs and college alumnae. Others used well-to-do patients, or clinic 
groups. Aside from KPM, Bromley and Britten is the only study that 
seems to have attempted to get nationwide geographic representation 
(we have omitted M. J. Exner’s 1915 study), while Davis has covered 
the eastern area, and Terman covers part of the California area. Al- 
though KPM’s sample is heavily charged with college students, a 
broader representation of social and educational levels is offered than 
in the other studies. All studies reviewed have special features which 
make generalizations to specific populations difficult. Certainly KPM’s 
sampling seems never worse and often better than that of the other 
studies.





710 AMERICAN STATISTICAL ASSOCIATION JOURNAL, DECEMBER 1953


46. Analysis 


Most studies confined their analysis to simple descriptive statistics— 
percentages, means, and medians. A few added ranges, standard devia- 
tions, correlation coefficients, and attempted significance tests. About 
half used two-way breakdowns, usually on background characteristics, 
as a way of sharpening differences between groups. Three studies 
offered scales either based on judges’ evaluations (Landis, and Landis 
and Bolles), or scoring of batteries of items (Terman). KPM restricted 
the use of scales to occupational classification and homosexual-
heterosexual rating. They added the accumulative incidence curve, the U. S.
corrections, and extensively used fine-grained (high-order) breakdowns.
In general, KPM’s analysis employed more devices and was more 
searching than the analyses offered by other studies. 


47. Interpretation 


We have already mentioned (33) that KPM are competent at the 
accurate and understandable verbal description of the meanings of a 
table whose entries are taken as correct. Some of the other authors 
have also done well, although the extent of their analysis is usually 
more limited. In inferring from sampled population to target popula- 
tion, all the studies are weak. The inferences left with the reader (if 
we are to judge) are much broader than the studies could possibly 
warrant. Every study has its own precautionary remarks to the effect 
that the reader must not extend the inferences beyond that of the 
population studied. Very little attempt is made to describe the target 
population, to help the reader with the step from sample to sampled 
population, or to remind him of sampling fluctuations. The precaution- 
ary remarks in the opening pages of a study are usually forgotten when 
the authors come to discuss matters of national policy, morals, legisla- 
tion, therapy, and psychological and sociological implications toward 
the end of their book. The reader must then be left with the inference 
that the findings apply on at least a national scale. Bromley and Britten 
are more forthright than most. They argue overtly that their volunteer 
college sample is representative of all U.S. individuals of college age.
Of the 10 studies considered, only two, Davis and Farris, seem to have 
consistently exercised due caution about generalization from sample 
to population and warnings to the reader. The last paragraph of the 
section entitled, “Description of Sample and Sampling Methods” 
in each review in Appendix B gives one reader’s opinion of the general- 
izations from sample to sampled population intended by the author. 

Our reviewer was not asked to gather data that would give us a way 
of comparing the extent of unsupported statements in the other studies
with those of KPM, so this aspect of interpretation remains uncompared
by us. It would be very interesting if someone would collect
such information, not only in connection with the present work, but 
with regard to general scientific writing in various fields. This would be 
no small task. 


CHAPTER XI. CONCLUSIONS 
48. Interviewing 


(1) The interviewing methods used by KPM may not be ideal, but 
no substitute has been suggested with evidence that it is an improve- 
ment. 

(2) The interviewing technique has been subjected to many criti- 
cisms (see Section A-11), but on examination the criticisms usually 
amount to saying “answer is unknown,” or “KPM have not demon- 
strated how good their method is.” 

These conclusions can be summarized by saying that we need to know 
more about interviewing in general. 


49. Checks 


(1) The types of methodological checks considered by KPM seem 
to be quite inclusive. 

(2) A greater volume of checks (more retakes, etc.) is desirable, as is
more delicate analysis. (See Sections C-15 and C-18.)

(3) The results of duplicate recording of interviews should be pub- 
lished. 

These conclusions can be summarized by saying that KPM’s checks 
were good, but they can afford to supply more. 


50. Sampling 


Given U. S. white males as the target population, our conclusions 
are that: 

(1) KPM’s starting with a nonprobability sample was justified. 

(2) It should perhaps already have been supplemented by at least 
a small probability sample. 

(3) If further general interviewing is contemplated, and perhaps even
otherwise, a small probability sample should be planned and taken. 

(4) In the absence of a probability-sample benchmark, the present 
results must be regarded as subject to systematic errors of unknown 
magnitude due to selective sampling (via volunteering and the like). 


51. Analysis 


KPM’s analysis is best described as simple and relatively searching. 
They did not use such techniques as analysis of variance or multiple 
regression, but they brought out the indications of their data in a
workmanlike manner.

In more detail: 

(1) their selection of variables for adjustment seemed to be a reason- 
ably effective substitute for more complex analyses, 

(2) they gave several measures of activity (giving the reader a choice 
at the expense of more tables to examine), 

(3) they made essentially no use of tests of significance, but cited 
many standard errors (which were inappropriate for their cluster sam- 
ples), 

(4) they used U. S. Corrections and their (independently developed) 
accumulative incidence curve. More careful exposition of these devices 
would have been desirable. 

To summarize in another way: 


(i) they did not shirk hard work, and 
(ii) their summaries were shrewd descriptive comments rather than 
inferential statements about clearly defined populations. 


Their main attempt at inferences was a sample size experiment whose 
results (i) could have been predicted by statistical theory, and (ii) were
irrelevant to their cluster sampling. 

They continued to add new interviews without redoing earlier tabu- 
lations, thus producing an unwarranted effect of sloppiness in the book, 
although their records were kept carefully and in unusually good 
shape. 


52. Interpretation 


(1) KPM showed competence in accurate and understandable ver- 
bal description of the trends and tendencies indicated by their tables. 
In stating and summarizing what the sample seems to show, they 
were competent and effective. 

(2) Their discussion of the uncertainties in the inferences from the 
numbers in the tables to the behavior of all U. S. white males was brief, 
insufficiently repeated, and oftentimes entirely lacking. In instilling 
due caution about sampling fluctuations and differences between 
sampled and target populations, they were lax and ineffective. 

(3) Their discussion of systematic errors of reporting is careful and 
detailed (with the exception of some questions bearing on generation 
comparisons). 

(4) Many of their most interesting statements are not based on the 
tables or any specified evidence, but are nevertheless presented as 
well-established conclusions. Statements based on data presented, 
including the most important findings, are made much too boldly and 
confidently. In numerous instances their words go substantially beyond
the data presented and thereby fall below our standard for good scien- 
tific writing. 

53. Comparison with other studies 


In comparison with nine other leading sex studies, KPM’s work is 
outstandingly good. 

In more detail, 

(1) their interviewing ranks with the best, 

(2) they have more and better checks, 

(3) their geographic and social class representation is broader and 
better, 

(4) their volunteer non-probability sample problem is the same, 

(5) they used more varied and searching methods of analysis, 

(6) only two of the nine studies (Davis and Farris) were more care- 
ful about generalization and warned the reader more thoroughly 
about its dangers. 


Thus, KPM’s superiority is marked.


54. The major controversial findings


It is perhaps fair to regard these four as KPM’s major controversial 
findings: 

(1) a high general level of activity, including a high incidence of
homosexuality,

(2) a small change from older to younger generations, 

(3) a strong relation between activity and socio-economic class, 

(4) relations between activity and changes of socio-economic class. 

All of these KPM set forth as well established conclusions. All are
subject to unknown allowances for: 

(a) difference between reported and actual behavior, 

(b) nonprobability sampling involving volunteering. 

While their findings may be substantially correct, it is hard to set 
any bounds within which the truth is statistically assured to lie (see
Appendix A, Section 4). Once again, we wish to point out that the same
difficulties are present in many sociological investigations. 


CHAPTER XII. SUGGESTED EXTENSIONS 
55. Probability sampling


Appendix D discusses the advantages, possibilities and difficulties
of probability sampling in some detail.

In brief summary:

(1) Costs and refusal rates together determine the wisdom of extensive
probability sampling.

(2) Information on costs and refusal rates is lacking. 

(3) Hence probability sampling should begin on a very small scale, 
say 20 cases. 

(4) A step-by-step program, starting at such a scale, seems wise, and
is recommended to KPM. 


56. Retakes 


While retakes showed high agreement on vital statistics, and moder- 
ately high agreement on incidence, the data presented in KPM for 
frequencies show considerably less agreement. The data do not make 
clear how much better a retake agrees with a take than with a randomly 
selected interview for another subject with the same age, religion, 
social class, etc. 

If the agreement is better, then retakes will provide evidence as to 
non-random agreement—evidence bearing on the much-discussed sub- 
ject of the constancy of recall. In addition, take-retake differences are 
clearly so large as to make retakes of two old subjects at least as valu- 
able as a take of one new subject in determining the average behavior 
of groups (see Section A-24). 

If the agreement is no better, then retakes will provide evidence 
that this was so, and every retake will be as valuable as a new take in 
determining the average behavior of groups. 

In our opinion 500 retakes would help the standing of KPM’s data 
more than 2000 new interviews (selected in the same old way). It would 
of course be important to determine and report the selective factors 
which influenced the selection of the retaken subjects. 


57. Spouses 


Separate interviews of husband and wife are a useful supplement to 
retakes, in that they supply the nearest approach to two independent 
reports of the same action, although the information is restricted for 
the most part to marital coitus, and is weakened by the possibility 
of collusion. In the book, KPM present comparisons for 231 pairs of 
spouses. 

In an expansion of this program, various elaborations could be sug- 
gested. The first objective should probably be to interview more pairs 
from the lower educational levels, in order that the agreement between 
spouses can be examined separately for different educational levels. 
As in the case of retakes, the data are not wasted so far as the main 
study is concerned, since they contribute both to the male and female 
samples. 


58. Presentation 


As the critics point out (Chapters VII-A, I-C), parts of the book 
are hard to understand because of lack of clarity of presentation. In 
future editions, the following steps would remove the major 
ambiguities. 

(i) KPM should explain why the numbers of cases change erratically 
from table to table. In future publication it would be worth substantial 
effort to avoid these changes. 

(ii) Table headings and contents should be critically reviewed as to 
their lucidity. 

(iii) Worked examples of the calculation of U. S. corrections should 
be given. References under the tables to the variables used for 
correction should be more precise. 

(iv) More discussion should be given, with numerical illustration, 
of the meaning of accumulative incidence percentages. 

(v) More information should be given about the questions asked, 
with their variations, in the interview. Although this would be 
extremely laborious to do for the complete interview, one or two blocks 
of related questions might serve the purpose. For such a block, KPM 
might describe (a) the variations used in the statement of the 
questions, (b) the variations in the order of questions, (c) the reasons 
for the variations. An illustration of this type would give deeper 
insight into the logical structure of KPM’s interviewing technique and 
might go far to substantiate their claim (p. 52) that flexibility is one 
of the strengths of their technique. 

(vi) Several critics make a strong plea that more information be 
given about the composition of the sample (see Chapters I-A, I-C). 
The specific items requested vary with the critic, and some would be 
a major undertaking both in preparation and publication. A minimum 
that seems feasible would be to present a multiple classification of the 
subjects according to the following items at the time of interview: age, 
marital status, occupation, educational status, religious affiliation, 
place of residence. In addition, more information is needed about the 
extent to which special groups (e.g., those in penal institutions, 
homosexual groups) contribute to the tables. 


59. Statistical analyses 


In Appendix C, a number of statistical analyses are outlined which 
would be a useful contribution to the methodology of studies of this 
kind. The analyses would require expert statistical direction. 

As has been pointed out, the standard errors presented by KPM 
are invalid, because they were computed on the assumption of random 
sampling of individuals. A method for calculating standard errors so 
as to take into account the actual nature of KPM’s sampling is given 
in Chapter II-C. These standard errors would allow a realistic appraisal 
of the stability of KPM’s means. They would indicate by how much 
the means determined from the present KPM sample are likely to vary 
from the means of a much larger sample of cases obtained by the KPM 
methods. 

KPM described orgasm rates in terms of per cent incidence and mean 
or median frequency. However, other mathematical functions of these 
variables may be more appropriate, leading to simpler statements of 
the results. Approaches for investigating this question, and the related 
question of the use of some combination of the variables, are suggested 
in Chapters III-C and IV-C. 

The question of applying adjustments to segment means has already 
been discussed (Section 17). A technique is presented (Chapter V-C) 
for reaching practical decisions on the appropriateness of adjustment 
and on the number of variables for which adjustment should be made. 


60. Relative priorities 


We give here our personal collective opinion as to how further effort 
on the male study might best be spent (we have not tried to evaluate 
priorities in comparison with the female study, or any other studies 
which KPM may contemplate). 

If the interviewer time which it would require were available, we 
believe that the effort required for the proposed probability sample 
would be worthwhile. 

So long as it did not interfere with the possibility of a probability 
sample, available interviewer time should be concentrated: 


on retakes when working in or near old areas. 
on husband-wife pairs when two interviewers are available. 


If the probability sample has already been ruled out, and if fewer 
interviewer months are available, then an attempt to retake a random 
sample of previous subjects would be most desirable, whenever possible, 
husband and wife being taken whenever either is retaken. 

Effort in the form of statistical analysis and presentation need not 
interfere with interviewing, and should be pressed to the extent that 
experienced and understanding personnel can be found. 





THE INVENTORY PROBLEM* 


J. LADERMAN, Office of Naval Research 
S. B. LITTAUER, Columbia University 
LIONEL WEISS, University of Virginia and Cornell University 


This article is expository, and is based on the two papers, “The 
Inventory Problem,” by A. Dvoretzky, J. Kiefer, and J. Wolfowitz, 
which appeared in the April 1952 and July 1952 issues of Econometrica. 
These papers are too advanced mathematically to be read by many 
of those to whom the results might be of interest. It is hoped that this 
paper will bring the new technique to the attention of those persons, 
both in government and private industry, who are responsible for mak- 
ing decisions affecting the amount of inventory to be held by their 
organizations. In the opinion of the present authors, great economies 
can result from the application of this new inventory control technique. 
The inventory problem can be stated very simply: it is to decide 
how much material to stock in preparation for an uncertain future. 
Both understocking and overstocking are costly, else there is no prob- 
lem. If overstocking is not penalized, such large stocks could be held 
that no conceivable future occurrence would deplete them; if under- 
stocking is not penalized, zero stocks could be held. The usual cases, 
where both understocking and overstocking are costly, are the ones of 
interest here. For example, the proprietor of a restaurant, buying 
perishables for the day, will see them spoil if he buys too many, or will 
turn customers away unsatisfied if he buys too few, thus failing to earn 
potential profits and perhaps permanently losing some customers. Even 
if a merchant does not deal in perishables, overstocking may involve 
carrying costs which include such items as rent, insurance, deprecia- 
tion, loss of interest on capital invested, etc. As a less homely example, 
an army would certainly be heavily penalized for being caught short of 
ammunition, but since there are other important items needed by an 
army, it would be possible to stock too much ammunition at the 
sacrifice of other military items. 
The reader can no doubt think of other cases, closer to home, where 
a balance must be struck between overstocking and understocking. 
The purpose of this article is to describe a method of striking this bal- 
ance so as to minimize the losses to be expected from taking the risks 
of overstocking or understocking, which are unavoidable when one 
has to provide for an uncertain future demand. 

As a first step, we describe a simple but important concept, the 
“schedule of losses.” The schedule of losses is a schedule which shows 
what the loss is for any given combination of stock held and future de- 
mand. A profit is regarded as a negative loss. The schedule of losses can 
often be simply expressed by a mathematical formula. For example, 
suppose a newspaper vendor buys y papers from the publisher for 3 
cents a copy, sells d copies for 5 cents a copy, and resells the unsold 
copies to the publisher for 1 cent a copy. Then his loss in cents is 
-2d + 2(y - d), because he makes 2 cents profit on each of the d 
papers he sells and loses 2 cents on each of the (y - d) papers he returns, 
where of course the number sold to customers, d, cannot exceed the 
number bought from the publisher, y. Thus for any possible combina- 
tion of numbers of papers bought and sold, we can compute the loss 
incurred by the vendor. The schedule of losses in this case is typical 
of many situations where unsold stock depreciates in value (perhaps 
even becomes worthless). 
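
The vendor's schedule of losses can be written out as a short function. This is only a sketch: the name `vendor_loss` and the choice to pass the number of customers wanting a paper (capping sales at the number bought) are assumptions made for illustration, not part of the original article.

```python
def vendor_loss(y: int, demand: int) -> int:
    """Loss in cents when y papers are bought and `demand` customers want one.

    A negative result is a profit, following the convention in the text.
    """
    d = min(y, demand)           # copies sold cannot exceed copies bought
    return -2 * d + 2 * (y - d)  # 2 cents gained per sale, 2 cents lost per return


# A few entries of the schedule (papers bought, customers):
for y, demand in [(200, 200), (250, 200), (100, 200)]:
    print(y, demand, vendor_loss(y, demand))
```

Buying exactly as many papers as there are customers gives the smallest loss in each row, which is the known-demand case taken up later in the text.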

The schedule of losses includes losses arising from the four following 
categories: 


a) Negative of the profit from a transaction or other gain from the completion 
of a mission. 


b) Carrying costs which are the losses arising from the stocking of the com- 
modity. 

c) Losses due to depletion which arise when the demand exceeds the available 
supply. 

d) Ordering costs which are the costs involved in processing an order to 
change the inventory level. 


In the above newspaper example the schedule of losses involved only 
items a and b. The -2d represented the negative of the profit from the 
sale of the d papers and the 2(y - d) represented the loss due to 
obsolescence of the papers (a carrying cost). 

It is almost inconceivable that a person responsible for making deci- 
sions would not have a fairly good idea of what the schedule of losses 
is for his case. In the absence of any knowledge at all about the sched- 
ule of losses, it is difficult to imagine on what rational grounds the size 
of inventory can be set. From now on, we shall assume that the sched- 
ule of losses is known, at least approximately, and shall then describe 
a method of choosing the size of inventory to be held. 

First we discuss a particularly simple case where stock can only be 
ordered or returned to the supplier (a negative order) at the beginning 
of a time interval, and only the stock held after the order is placed can 
be used to meet the demand that will arise during the interval. We as- 
sume that there is instantaneous delivery of the order, and also that the 
future demand is completely known. It is this last assumption that 
makes this case so simple, for once the schedule of losses is known and 
the future demand is known, we simply place an order of a size that will 
minimize the loss. Thus, in the case of the newspaper vendor above, it 
is clear that the number of copies he should buy from the publisher is 
the number of copies he will be able to sell (i.e. the total demand). For 
he loses 2 cents on each unsold paper, but makes 2 cents on each paper 
he sells. And in any other case where the future demand is known, it 
is a simple matter to choose an order that will minimize the loss. 

It is when the future demand becomes uncertain that we meet diffi- 
cult and more realistic cases. First we discuss how we shall interpret 
“uncertain future demand.” We will not interpret this as meaning com- 
plete lack of knowledge about future demand, nor, obviously, do we 
mean that we know exactly what the future demand is going to be. 
“Uncertain future demand” to us shall mean something between complete 
lack of knowledge and complete certainty; namely, that future demand 
is a chance variable with a known probability distribution. In 
other words, future demand may have any one of several values, with 
known probabilities. The reader will perhaps inquire under what cir- 
cumstances we would be justified in regarding demand as a chance 
quantity. We will not try to give a complete answer here, but will note 
that demand may depend on various factors of a chance nature, there- 
by making demand itself a chance quantity. For example, demand may 
depend upon the weather, which itself is frequently considered as 
though it depends on chance. 

To make these ideas more specific, let us take the case of the news- 
paper vendor discussed above. Suppose he is located in a suburban rail- 
road station, and that each morning there are 200 customers who reach 
the station early enough to buy a paper from him. Another 50 potential 
customers arrive at the station in a bus. If the bus arrives early, each 
of the 50 buys a paper, but if the bus arrives late, none of the 50 has 
time to buy a paper. Let us assume that the bus arrives late half of the 
time, and there is no way of telling beforehand on any day whether or 
not the bus will be late. Then it is clear that the demand for the 
vendor’s papers on any given day is a chance variable which can take 
the value 200 with probability 1/2, or 250 with probability 1/2. This 
means that, in the long run, on 1/2 of the days the demand will be for 
200 papers, on the other 1/2 of the days, for 250 papers. How many papers 
should the vendor buy from the publisher in this case? 

The vendor’s loss will be a chance variable because it will depend 
upon the demand, itself a chance variable. The probability distribution 
of the loss will depend upon the number of papers the vendor buys 
from the publisher. (We remind the reader that the distribution of a 
chance variable is simply a list of the possible values of the chance vari- 
able with their respective probabilities.) Roughly speaking, it is clear 
that the size of the vendor’s purchase from the publisher should be such 
as to make the probabilities of large losses small. This rather vague 
statement, however, is not explicit enough to enable us to decide just 
what the size of the vendor’s purchase should be. Many different inter- 
pretations can be made, but throughout the remainder of this paper we 
are going to choose the size of the order to make the expected value of 
the loss as small as possible. A justification for using the procedure 
which minimizes the expected value of the loss is that such a procedure 
is the best one to use if one wishes to minimize the average loss in the 
long run. In the next paragraph we shall review briefly the concept of 
expected value. 

If one takes many observations on a chance variable, the average of 
the observations will ordinarily tend to some number. That number is 
called the expected value of the chance variable. More precisely, for 
our purposes the expected value of a chance variable may be defined 
as the weighted average of all the values the chance variable can take 
on, with the probability of each value as its weight. For example, 
suppose a chance variable can take on the values 1, 2, or 3 with 
probabilities of 1/4, 1/2, 1/4 respectively. Then the expected value is 
given by (1/4)(1) + (1/2)(2) + (1/4)(3) = 2. Clearly, if many 
observations are made on this chance variable, about 1/4 of them will 
have the value 1, about 1/2 of them will have the value 2, and the 
remaining ones will have the value 3, so the average of all the 
observations will usually be close to 2. 
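
The weighted-average definition is easy to check numerically. The sketch below applies it to the vendor's demand (200 or 250 papers, the bus being late half of the time) and compares the result with the average of many simulated days. The function name, the number of simulated days and the random seed are illustrative assumptions.

```python
import random
from fractions import Fraction


def expected_value(distribution):
    """Weighted average of the values, each weighted by its probability."""
    assert sum(distribution.values()) == 1, "probabilities must sum to 1"
    return sum(value * prob for value, prob in distribution.items())


# The vendor's demand: 200 or 250 papers, each with probability 1/2.
demand = {200: Fraction(1, 2), 250: Fraction(1, 2)}
print(expected_value(demand))  # 225

# Long-run check: average the demand over many simulated days.
rng = random.Random(0)  # arbitrary seed, fixed only for repeatability
days = [rng.choice([200, 250]) for _ in range(10_000)]
long_run_average = sum(days) / len(days)
print(long_run_average)  # close to 225
```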

It is clear from the discussion of the preceding paragraph that choos- 
ing the size of the order to minimize the expected value of the loss is 
not an unreasonable procedure, since the smaller the probabilities of the 
larger losses, the smaller the expected value of the loss. Also, if such a 
policy is applied over and over, the average loss will usually be less than 
that obtained from any other policy. 

Before we actually compute the size of the order that will minimize 
the newspaper vendor’s expected loss, we shall discuss a possible objec- 
tion to our whole procedure. The practical definition of probability is in 
terms of the “long-run.” That is, when we say that the probability of 
an event is 1/2, we mean that in a long series of experiments or trials, 
the event will occur about 1/2 of the time. Therefore, it might be asked, of 
what use is probability theory, whose statements in practice refer to the 
long run, in a problem like that of the newspaper vendor, who is con- 
cerned with his loss in one particular day? In many cases, no answer to 
this objection is necessary, since we will be dealing with a long series of 
trials. Even our newspaper vendor will presumably be trying to sell 
his papers day after day under the same circumstances, so that min- 
imizing his expected loss for one day is equivalent to minimizing his 
average loss per day over all the days he will be selling papers. In those 
cases where there will not be a long series of trials, an answer to the 
objection might be that even though, in practice, the probability of an 
event in one trial is the proportion of times it would occur in a long 
series of identical trials, even if only one trial were made, the higher 
the probability of the event, the greater would be our confidence that 
the event would occur in that one trial. For example, if one were told 
that he will be executed if he draws a red card from a deck, he would 
certainly prefer to draw the card from a deck of 51 black cards and 1 
red card rather than from a regular deck of cards. 

Returning now to the newspaper vendor, we want to find how many 
newspapers he should buy from the publisher in order to minimize his 
expected loss, where his schedule of losses (from above) is 
-2d + 2(y - d) = 2y - 4d, and the number of customers who will seek to 
buy a paper is a chance variable with possible values of 200 or 250, 
each with probability 1/2. If the vendor buys 200 or fewer papers from 
the publisher, all the copies will be sold to customers, none resold to 
the publisher, so the loss will be minus twice the number bought from 
the publisher, namely -2y. From this it is apparent that no fewer than 
200 copies should be bought from the publisher, for the loss decreases 
as the number bought increases from zero to 200. Also, it is clear that 
no more than 250 papers should be bought, for the total in excess of 
250 will surely have to be resold to the publisher at a loss of 2 cents 
each. So the proper number to order to minimize the expected loss is 
between 200 and 250 inclusive. Then the loss will be either 2y - 4(200) 
with probability 1/2 (this is when the bus is late), or else 
2y - 4y = -2y with probability 1/2 (this is when the bus is not late, 
making d = y). Therefore the expected loss when the number bought from 
the publisher is between 200 and 250 is equal to 
(1/2)(2y - 800) + (1/2)(-2y), which equals minus 400 cents. Thus it 
turns out that the expected loss is the same for any order between 200 
and 250 and it is greater for any other order. Just to be specific, let 
us agree that whenever more than one order will achieve the minimum 
expected loss, we will choose the smallest of the orders. Thus in this 
case we would order 200 papers. 
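
The same conclusion can be reached by brute force, tabulating the expected loss for every order size and keeping the smallest minimizer. This is a minimal sketch; the search range of 0 to 300 papers and all names are assumptions chosen for illustration.

```python
from fractions import Fraction

# Demand: 200 customers if the bus is late, 250 if it is not.
DEMAND = {200: Fraction(1, 2), 250: Fraction(1, 2)}


def loss(y, demand):
    """Schedule of losses 2y - 4d, where d = copies actually sold."""
    d = min(y, demand)
    return 2 * y - 4 * d


def expected_loss(y):
    """Probability-weighted average of the loss over the possible demands."""
    return sum(prob * loss(y, demand) for demand, prob in DEMAND.items())


candidates = range(0, 301)  # assumed search range
best_value = min(expected_loss(y) for y in candidates)
best_orders = [y for y in candidates if expected_loss(y) == best_value]

# Minimum expected loss, then the smallest and largest orders achieving it.
print(best_value, best_orders[0], best_orders[-1])  # -400 200 250
```

Taking `best_orders[0]` implements the convention of choosing the smallest order among those that tie.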

We shall now give another example to illustrate how this method can 
be applied to an actual inventory problem of the Navy. There are cer- 
tain rather expensive items (some costing over $100,000 each) known 
as “insurance spares” which are generally procured at the time a new 
class of ships is under construction. These spares are bought even 
though it is known that it is very unlikely that any of them will ever 
be needed and that they cannot be used on any ship except those of 
that particular class. They are procured in order to provide insurance 
against the rather serious loss which would be suffered if one of these 
spares were not available when needed. Also, the initial procurement 
of these spares is intended to be the only procurement during the life- 
time of the ships of that class because it is extremely difficult and costly 
to procure these spares at a later date. The present policy is to order 
quantities of these spares according to the following schedule: 


Total number of      Number of spares 
items installed          ordered 
      1-4                   1 
      5-50                  3 
     51-100                 3 
   over 100                 4 


This particular ordering policy is based on the judgment of personnel 
familiar with the expected usage rate of such technical spares and also 
experienced with procurement policies of the Navy. However, they 
will admit that the construction of such a table is largely an intelligent 
guess and that the quantities shown to be ordered cannot be justified 
objectively. On the other hand, the procedure to be given below, based 
on the previous discussion, will lead to an objective method of 
constructing such a table. Moreover, by using this procedure the total loss 
over a long period of time will ordinarily be less than that obtained from 
any other ordering policy. 

Let us suppose N ships are being constructed of a certain class 
containing an item of the type described above, for which spares cost P 
dollars each. Let p_i represent the probability that exactly i spares 
will be needed as replacements during the lifetime of the N ships; that 
is, p_1 is the probability that exactly one spare will be needed, p_2 is 
the probability that exactly two spares will be needed, etc. Let us also 
assume that the probability of 5 or more spares being needed is zero. 
Let L dollars be the loss (usually quite large) suffered for each spare 
that is needed when there is none available in stock. In obtaining the 
schedule of losses we shall neglect all the smaller losses and include 
only the cost of the spares (which become worthless if never used) and 
the depletion loss occurring when spares are needed but not available. 
Then for any d ≤ y, there is no loss from depletion, so the schedule of 
losses would simply be the number bought multiplied by the unit price, 
which is yP. For d ≥ y, the schedule of losses would be yP + (d - y)L 
because (d - y) is the number of spares needed but not available, and L 
is the loss incurred for each one short. Hence, the schedule of losses, 
which is denoted by W(y, d), is 

\[
W(y, d) =
\begin{cases}
yP & \text{for } d \le y, \\
yP + (d - y)L & \text{for } d \ge y.
\end{cases}
\]


Now let us get the expected value of the loss, EW(y, d), for each value 
of y from y = 0 to y = 4. These are the only values of y which need be 
considered because if y is greater than 4, the loss will surely be 
greater than the loss for y = 4. For y = 0 we have that d ≥ y for all 
possible values of d, hence 

\[
\begin{aligned}
EW(y, d) &= 0 \cdot P + [\text{Prob. } d = 0](0 - 0)L + [\text{Prob. } d = 1](1 - 0)L \\
&\quad + [\text{Prob. } d = 2](2 - 0)L + [\text{Prob. } d = 3](3 - 0)L \\
&\quad + [\text{Prob. } d = 4](4 - 0)L \\
&= (p_1 + 2p_2 + 3p_3 + 4p_4)L \quad \text{for } y = 0.
\end{aligned}
\]

In a similar manner, we get 

\[
\begin{aligned}
EW(y, d) &= P + (p_2 + 2p_3 + 3p_4)L \quad \text{for } y = 1 \\
&= 2P + (p_3 + 2p_4)L \quad \text{for } y = 2 \\
&= 3P + p_4 L \quad \text{for } y = 3 \\
&= 4P \quad \text{for } y = 4.
\end{aligned}
\]

Now for any given values of P, L, and p_i, it is a simple matter to 
tabulate the values of EW(y, d) for the different values of y in order 
to determine which value of y gives the smallest expected loss. For 
example, suppose P = $100,000, L = $10,000,000, p_1 = .04, p_2 = .01, 
p_3 = .001, p_4 = .0002, and the probability of 5 or more spares being 
needed is zero (hence p_0 = .9488), then 

for y = 0, EW(y, d) = (.04 + .02 + .003 + .0008)(10,000,000) = $638,000 
for y = 1, EW(y, d) = 100,000 + (.01 + .002 + .0006)(10,000,000) = $226,000 
for y = 2, EW(y, d) = 200,000 + (.001 + .0004)(10,000,000) = $214,000 
for y = 3, EW(y, d) = 300,000 + (.0002)(10,000,000) = $302,000 
for y = 4, EW(y, d) = $400,000. 


Since under the above assumed conditions the expected loss is smallest 
when y= 2, the best ordering policy is to order 2 spares. 
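
The tabulation above can be reproduced mechanically, which also makes it easy to try other values of P, L and the p_i. The following is a sketch under the stated assumptions of the example; the variable and function names are illustrative, and exact fractions are used only to avoid rounding noise.

```python
from fractions import Fraction

P = 100_000       # cost of one spare, dollars
L = 10_000_000    # loss for each spare needed but not in stock, dollars

# Probability that exactly d spares are needed (zero for d >= 5).
p = {
    0: Fraction("0.9488"),
    1: Fraction("0.04"),
    2: Fraction("0.01"),
    3: Fraction("0.001"),
    4: Fraction("0.0002"),
}


def expected_loss(y):
    """EW(y, d) = yP + L * sum over d of p_d * (shortage when d > y)."""
    shortage = sum(prob * max(d - y, 0) for d, prob in p.items())
    return y * P + L * shortage


table = {y: expected_loss(y) for y in range(5)}
for y, value in table.items():
    print(y, int(value))

best_y = min(table, key=table.get)
print("order", best_y, "spares")  # order 2 spares
```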

The newspaper vendor’s problem and the Navy inventory problem 
that we have discussed, simple as they are, contain the most important 
elements of all the other problems we shall discuss. We now give a more 
general formulation of essentially the same problem. 

Suppose that at the beginning of a certain time period we have a 
stock, x, of a certain commodity, which we shall call stock before 
ordering. During the period a certain demand, d, for the commodity will 
be observed. This demand is a chance variable whose probability dis- 
tribution is known to us. The probability distribution of demand may 
depend upon the stock before ordering and the size of our order, but 
once these two quantities are known, the distribution of demand is 
known. To prepare for this demand, we have the privilege of ordering 
more of the commodity from the producer, ordering no more, or re- 
turning some, but this must be done at the beginning of the period. No 
orders can be delivered or stock returned once the period has started. 
We shall assume that there is no time lag in delivery from or to the 
supplier. Our problem is to find the quantity we should order, which 
will be denoted by y - x. Thus y is the quantity on hand at the start 
of the time period but after ordering. A return of goods to the producer 
is a negative order. In all practical cases there will be certain limits on 
the size of orders that can be placed, and our solution of the problem 
will take this into account. Our schedule of losses tells us what our loss 
is for each possible combination of values of the stock before ordering, 
order, and demand. In the newspaper vendor’s problem and in the 
Navy inventory problem the stock before ordering was zero which is 
why the x did not appear in the schedule of losses. 

In general we will choose that size of order such that the expected 
loss is minimized. The size of order that minimizes the expected loss 
will depend upon the size of stock before ordering. An “ordering policy” 
is a schedule showing what size of order to use for any given size of 
stock before ordering. The ordering policy is the complete solution to 
our problem, for it tells us just what to do in any given circumstances. 

A simple example will illustrate the ideas we have been discussing. 
Suppose the proprietor of a newsstand has x copies of a monthly 
magazine on hand at the end of the 10th of the month, for which he has 
already paid. The wholesale magazine dealer is coming the next morning 
and will take back as many magazines as the vendor desires to return, 
paying the vendor 4 cents each, or he will sell the vendor any number 
of additional copies at 6 cents each. The vendor charges his customers 
15 cents per copy. The wholesaler will not return until the end of the 
month, at which time he will buy back all unsold magazines at 2 cents 
per copy. We assume that the number of people who will attempt to 
buy a copy from him during the remainder of the month will be either 
4 or 5 with probabilities of 3/4 and 1/4 respectively. Also, if the 
vendor has any unsold copies at the end of the month, he will definitely 
sell them back to the wholesaler. What should the ordering policy be in 
this case? The schedule of losses, which shall be denoted by W(x, y, d) 
because it depends on the value of x, on the ordering quantity, y - x, 
and on the demand, d, is obtained in the following way: 

When d ≥ y, y copies will be sold, which will bring the vendor 15y 
cents. In addition, if y ≤ x, then (x - y) copies are returned to the 
wholesaler, which brings the vendor 4(x - y) cents, making 
15y + 4(x - y) his total gain, and if y ≥ x, the vendor purchases an 
additional (y - x) copies, making his total gain 15y - 6(y - x). The 
negatives of these gains are the losses, yielding 

\[
\begin{aligned}
W(x, y, d) &= -[15y + 4(x - y)] = -11y - 4x \quad \text{for } d \ge y,\ y \le x \\
&= -[15y - 6(y - x)] = -9y - 6x \quad \text{for } d \ge y,\ y \ge x.
\end{aligned}
\]

When d ≤ y, we have a situation similar to the above except that d 
copies are sold by the vendor instead of y copies and (y - d) copies are 
returned to the wholesaler at the end of the month at 2 cents each, 
yielding 

\[
\begin{aligned}
W(x, y, d) &= -[15d + 4(x - y) + 2(y - d)] \\
&= -13d - 4x + 2y \quad \text{for } d \le y,\ y \le x \\
W(x, y, d) &= -[15d - 6(y - x) + 2(y - d)] \\
&= -13d - 6x + 4y \quad \text{for } d \le y,\ y \ge x.
\end{aligned}
\]
Now we need the expressions for the expected value of the loss, 
EW(x, y, d), from which for any given x, we will be able to find the y 
(hence the order quantity, y - x) which will minimize the expected 
loss. For y ≤ 4, the demand is certainly equal to or greater than y, 
and since in this case W(x, y, d) does not depend on the value of d, we 
have 

\[
\begin{aligned}
EW(x, y, d) &= -11y - 4x \quad \text{for } y \le 4,\ y \le x \\
&= -9y - 6x \quad \text{for } y \le 4,\ y \ge x.
\end{aligned}
\]

Clearly the expected value of the loss is minimized in this case when y 
is made as large as possible, which is 4. On the other hand, for y ≥ 5, 
the demand is equal to or less than y, and we have 

\[
\begin{aligned}
EW(x, y, d) &= -13\left[\tfrac{3}{4}(4) + \tfrac{1}{4}(5)\right] - 4x + 2y \\
&= -55\tfrac{1}{4} - 4x + 2y \quad \text{for } y \ge 5,\ y \le x \\
EW(x, y, d) &= -13\left[\tfrac{3}{4}(4) + \tfrac{1}{4}(5)\right] - 6x + 4y \\
&= -55\tfrac{1}{4} - 6x + 4y \quad \text{for } y \ge 5,\ y \ge x.
\end{aligned}
\]


Here the expected value of the loss is minimized by making y as small 
as possible, which is 5. Hence the ordering policy is certain to call 
for y = 4 or y = 5, which was fairly obvious anyway from the fact that 
the demand could be only 4 or 5. However, the above expressions for 
EW(x, y, d) are needed to determine when y = 4 and when y = 5. Suppose 
x ≤ 4, then the expected loss for y = 4 is -36 - 6x and for y = 5 it is 
-35 1/4 - 6x, which is greater than -36 - 6x; hence the best policy 
when x ≤ 4 is to order up to 4. Now suppose x ≥ 5, then the expected 
loss for y = 4 is -44 - 4x and for y = 5 it is -45 1/4 - 4x, which is 
less than -44 - 4x; hence the best policy now is to return all those 
over 5. To summarize the above, the best policy for the vendor is to buy 
up to 4 if he has less than 4 on hand, or to do nothing if he has 4 or 
5, or to return any excess over 5 on hand. 
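
The ordering policy for the magazine vendor can also be found by direct enumeration. The sketch below assumes demand probabilities of 3/4 for 4 copies and 1/4 for 5 copies, which is the reading under which the stated policy comes out; the function names and the cap of 10 copies on stock after ordering are illustrative assumptions.

```python
from fractions import Fraction

# Assumed demand distribution for the rest of the month.
DEMAND = {4: Fraction(3, 4), 5: Fraction(1, 4)}


def loss(x, y, d):
    """W(x, y, d) in cents: x = stock before ordering, y = stock after, d = demand."""
    sold = min(y, d)
    gain = 15 * sold + 2 * (y - sold)  # sales, plus month-end buy-back at 2 cents
    if y <= x:
        gain += 4 * (x - y)            # copies returned now at 4 cents each
    else:
        gain -= 6 * (y - x)            # extra copies bought now at 6 cents each
    return -gain


def expected_loss(x, y):
    return sum(prob * loss(x, y, d) for d, prob in DEMAND.items())


def best_stock(x, max_stock=10):
    """Stock level after ordering that minimizes expected loss (smallest on ties)."""
    return min(range(max_stock + 1), key=lambda y: (expected_loss(x, y), y))


policy = {x: best_stock(x) for x in range(9)}
for x, y in policy.items():
    print(f"on hand {x}: order {y - x:+d}, giving stock {y}")
```

The resulting table orders up to 4 when fewer than 4 copies are on hand, leaves 4 or 5 untouched, and returns any excess over 5.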

Next we shall discuss a more general problem—the case where there 
are several time intervals with carry-over of stock from one time in- 
terval to the next time interval. Here we are given a certain number 
of time intervals, and stock may be ordered or returned at the begin- 
ning of any of the intervals, but at no other times. Unused stock at the 
end of an interval may be kept for use in the next interval, with addi- 
tional stock ordered from the supplier if desired, or some or all of it 
may be returned to the supplier. Only the stock available at the begin- 
ning of an interval may be used to supply demand arising in that inter- 
val, and for the present we assume instantaneous delivery of orders 
from the supplier. The total loss is the sum of losses suffered in each of 
the intervals, and the different intervals may have different loss sched- 
ules. Furthermore, the loss in any time interval may depend on the 
whole “past history”—defined as all the stocks, orders, and demands 
in all the preceding intervals—as well as on the stock before ordering, 
order, and demand of the interval itself. Also, the probability 
distribution of the demand that will be observed in an interval may depend on 
the past history as well as on the stock before ordering and order of the 
interval itself. Once the past history and the stock before ordering and 
order of the interval are known, the probability distribution of the de- 
mand that will be observed in the interval is completely known. Pre- 
sumably we are interested in minimizing the expectation of the present 
value of the total loss, and therefore will apply the proper discounting 
factors to the losses incurred in the various intervals in order to get the 
present values of those losses. These discounting factors can be as- 
sumed to be incorporated into the loss functions for the various inter- 
vals. 

A complete ordering policy in the case of many intervals must specify 
just how large we should make the order at the beginning of each in- 
terval in the light of the knowledge we possess at the beginning of the 
interval (i.e. our knowledge of the stocks, orders, and demands in the 
preceding intervals and the stock before ordering of the interval itself). 
In general the order we place at the beginning of any interval will de- 
pend upon the past history, and different past histories will require dif- 
ferent orders. In constructing an ordering policy, it is important to 
remember that once we have reached the beginning of an interval, the 
losses we have suffered in the preceding intervals are now beyond our 
control, so it is only the expected losses from the remaining intervals 
that we worry about. 

For the case of many intervals, we now give a method of constructing 
an ordering policy that makes the expected loss as small as possible. 
First we specify how much to order at the beginning of the last inter- 
val. This is simple, for there is only one interval left to worry about, 
and we want to make the expected loss in that one interval as small as 
possible. At the beginning of the last interval we know all that has hap- 
pened in the preceding intervals, and therefore we know the schedule 
of losses and the probability distribution of demand in the last inter- 
val. Thus we have essentially the problem of making the expected loss 
in one interval as small as possible, and we have discussed this prob- 
lem of one interval above. In other words, the problem of how much to 
order at the beginning of the last interval is a simple one-interval prob- 
lem, which we know how to solve. Thus, for all conceivable past his- 
tories we can make up a schedule showing how much to order at the 
beginning of the last interval. 

Now we specify how much to order at the beginning of the next-to-the-last
interval. Once we know the past history before this next-to-the-last
interval, then for any particular order we place at the beginning
of the next-to-the-last interval, we can compute the total expected
loss in the two last intervals. The reason we can do this is that we have 
already specified what order we are going to place at the beginning of 
the last interval, under any conceivable circumstances. The total ex- 
pected loss in the two last intervals will, of course, depend on the order 
placed at the beginning of the next-to-the-last interval, so we simply 
pick that order that makes the total expected loss in the two last inter- 
vals as small as possible. This order will in general depend on the past 
history before the next-to-last interval. So far, then, we have specified 
how much to order at the beginning of the last interval and at the be- 
ginning of the next-to-last interval. 

Next we specify how much to order at the beginning of the third 
interval from the end. Once we know the past history before that inter- 
val, then for any particular order we place at the beginning of the in- 
terval, we can compute the total expected loss in the last three intervals. 
The reason is that we have already specified how much we will order 
at the beginning of the last two intervals under any conceivable cir- 
cumstances. We place that order at the beginning of the third interval 
from the end that makes the total expected loss in the three last inter- 
vals as small as possible. Thus, we have specified how much to order 
at the beginning of the last three intervals. 

And so we work our way back, interval by interval, until we have 
specified how much to order at the beginning of the first interval. Once 
we reach this point, our problem is solved, for we know how much to 
order at the beginning of each interval to make the total expected loss 
as small as possible. 
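
The backward recursion just described can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions that the text does not make: the loss and the demand distribution in an interval are taken to depend only on the stock on hand rather than on the whole past history, and stock may be ordered but not returned. The names `solve`, `demand`, `loss`, and `max_stock` are hypothetical.

```python
from fractions import Fraction

def solve(n_intervals, max_stock, demand, loss):
    """Backward induction over stock levels.

    demand[t]        : dict {demand value: probability} for interval t
    loss(t, x, y, d) : loss in interval t with stock x before ordering,
                       stock y after ordering, and demand d
    Returns (value, policy): value[t][x] is the least expected loss from
    interval t onward, policy[t][x] the stock level to order up to.
    """
    value = [[Fraction(0)] * (max_stock + 1) for _ in range(n_intervals + 1)]
    policy = [[0] * (max_stock + 1) for _ in range(n_intervals)]
    for t in reversed(range(n_intervals)):      # last interval first
        for x in range(max_stock + 1):
            best = None
            for y in range(x, max_stock + 1):   # ordering only, no returns
                exp = sum(p * (loss(t, x, y, d) + value[t + 1][max(y - d, 0)])
                          for d, p in demand[t].items())
                if best is None or exp < best:
                    best, policy[t][x] = exp, y
            value[t][x] = best
    return value, policy
```

The schedule for the last interval is filled in first, and each earlier interval reuses the schedules already computed for the intervals after it, exactly as in the verbal argument.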

An example with two time intervals will now be given to help clarify
the discussion just completed. Let us go back to the last example with
the newsstand proprietor and add another time interval to the problem
by assuming that the proprietor has two possible ordering times, on the
mornings of the 1st and 11th of the month. On the 1st he will have no
stock on hand before ordering, but on the 11th he may have some left
over from the quantity he purchased on the 1st and failed to sell. Let
us also assume that all the conditions given in the previous example
remain unchanged and that the demand during the 1st interval is either
10, 14, or 18 with probabilities of 1/4, 1/2, 1/4 respectively, and that the
demand during the 2nd interval is independent of the demand during the
1st interval. What we need to determine is how many copies the
proprietor should order on the 1st and what ordering policy he should
use on the 11th. No doubt there are vendors, particularly those who are
reluctant to take risks, who would buy only 14 on the 1st in order to
avoid the risk of suffering losses from returning unsold magazines to
the wholesaler. Let us now determine what the best policy really is.
It will be convenient to introduce the following notation: 
$y_1$ = number of magazines purchased on the 1st.
$y_2$ = number of magazines in stock after purchasing on the 11th.
$d_1$ = number of people attempting to buy a magazine during the 1st interval.
$d_2$ = number of people attempting to buy a magazine during the 2nd interval.
$x_2$ = number of magazines in stock at the end of the 1st interval. (This is equal
to $y_1 - d_1$ if $y_1 > d_1$; otherwise it is 0.)


To find the best ordering policy, we make believe that we have
reached the morning of the 11th and therefore know $y_1$ and $d_1$. We now
need to find the value of $y_2$ which makes the expected loss in the 2nd
interval as small as possible with the stock on hand being $x_2$. But the
best policy on the 11th has already been worked out in the last example
of the one interval case, so that policy is the best one to use on the 11th.
For this 2nd interval we found that $y_2$ should be 4 if $x_2 \le 4$, and the
expected value of the loss is then $-36 - 6x_2$; and $y_2$ should be 5 if
$x_2 \ge 5$, and the expected value of the loss is then $-44\tfrac{1}{2} - 4x_2$.
All that remains to be found is how many the vendor should buy on the 1st
so that the expected value of the losses from both intervals is minimized.
Clearly he should buy at least 14 because he is certain to sell at least 10
during the 1st interval and at least 4 during the 2nd interval. Also, he
should buy at most 18 since the demand during the 1st interval cannot
exceed that quantity.

If $d_1 \ge y_1$, then the vendor would sell all $y_1$ copies at a profit of 9
cents each. His schedule of losses during the 1st interval would then be

$$W(y_1, d_1) = -9y_1 \quad \text{for } d_1 \ge y_1.$$

He would then end the 1st interval with no stock on hand, $x_2 = 0$, and
the optimal policy on the 11th would be to buy 4 copies giving an
expected loss of $-36 - 6x_2 = -36$ for the 2nd interval. Thus the total loss
from both intervals is

$$-9y_1 - 36 \quad \text{for } d_1 \ge y_1.$$


If $d_1 \le y_1$, the vendor would sell only $d_1$ copies at 15 cents each and he
would have bought $y_1$ copies at 6 cents each, making the loss during the
1st interval

$$W(y_1, d_1) = -[15d_1 - 6y_1] = -15d_1 + 6y_1 \quad \text{for } d_1 \le y_1.$$

He would then end the 1st interval with a stock of $x_2 = y_1 - d_1$, and we
know that the best policy then would be to make $y_2 = 4$ if $x_2 \le 4$ and to
make $y_2 = 5$ if $x_2 \ge 5$. The expected loss from the 2nd interval in these
two cases is given by

$$-36 - 6x_2 = -36 - 6(y_1 - d_1) \quad \text{for } y_1 - d_1 \le 4$$

and

$$-44\tfrac{1}{2} - 4x_2 = -44\tfrac{1}{2} - 4(y_1 - d_1) \quad \text{for } y_1 - d_1 \ge 5.$$

The total loss for both intervals for $y_1 \ge d_1$ would then be given by

$$-15d_1 + 6y_1 - 36 - 6(y_1 - d_1) = -36 - 9d_1 \quad \text{for } d_1 \le y_1 \le d_1 + 4$$

and

$$-15d_1 + 6y_1 - 44\tfrac{1}{2} - 4(y_1 - d_1) = -44\tfrac{1}{2} - 11d_1 + 2y_1 \quad \text{for } y_1 \ge d_1 + 5.$$


We now want to compute the expected value of the total loss for $y_1$
ranging from 14 to 18, which are the only values we need consider. We
note that for $d_1 = 10$, we have $y_1 \ge d_1 + 5$ for all the values of $y_1$ except
when $y_1 = 14$, in which case we have $d_1 \le y_1 \le d_1 + 4$. For $d_1 = 14$, we
always have $d_1 \le y_1 \le d_1 + 4$, and for $d_1 = 18$, we have $y_1 \le d_1$. Hence the
expressions for the expected value of the total loss are

$$\tfrac{1}{4}(-36 - 9(10)) + \tfrac{1}{2}(-36 - 9(14)) + \tfrac{1}{4}(-126 - 36) = -153 \quad \text{for } y_1 = 14$$

and

$$\tfrac{1}{4}(-44\tfrac{1}{2} - 11(10) + 2y_1) + \tfrac{1}{2}(-36 - 9(14)) + \tfrac{1}{4}(-9y_1 - 36) = -128\tfrac{5}{8} - \tfrac{7}{4}y_1 \quad \text{for } 15 \le y_1 \le 18.$$

By taking $y_1 = 18$ we find that the expected total loss is
$-128\tfrac{5}{8} - \tfrac{7}{4}(18) = -160\tfrac{1}{8}$, which is the smallest we can make
this expected loss. Therefore
the best policy for the vendor is to buy 18 copies on the 1st and to
use the policy previously given on the 11th. 
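
The arithmetic of this example can be checked with a short Python sketch. It assumes a particular reading of the printed figures: first-interval demand probabilities of 1/4, 1/2, 1/4, a purchase price of 6 cents, a selling price of 15 cents, and second-interval expected losses of -36 - 6x and -44 1/2 - 4x for stock x of at most 4 and at least 5 respectively. The function names are hypothetical.

```python
from fractions import Fraction as F

# Assumed reading of the example's figures (see the paragraph above).
P_D1 = {10: F(1, 4), 14: F(1, 2), 18: F(1, 4)}

def second_interval_loss(x2):
    """Expected 2nd-interval loss under the best policy for the 11th."""
    return -36 - 6 * x2 if x2 <= 4 else -F(89, 2) - 4 * x2

def expected_total_loss(y1):
    """Expected loss over both intervals when y1 copies are bought on the 1st."""
    total = F(0)
    for d1, p in P_D1.items():
        sold = min(y1, d1)
        first = 6 * y1 - 15 * sold          # purchase cost minus receipts
        total += p * (first + second_interval_loss(y1 - sold))
    return total

losses = {y1: expected_total_loss(y1) for y1 in range(14, 19)}
best_y1 = min(losses, key=losses.get)
```

Under these assumptions `losses[14]` is -153, `losses[18]` is -160 1/8, and `best_y1` is 18, in agreement with the conclusion of the example.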

A further generalization of the inventory problem is to allow time
lags in the delivery of orders. In other words, an order placed at the
beginning of an interval will not arrive until a certain number, T, of
intervals have passed. Otherwise the problem is the same as the type
just discussed, and the method of solution is almost the same. Obviously
the last order will be placed (T+1) time intervals before the end,
since no order placed later than that will arrive in time to be of any use.
We choose the size of the last order to make the total expected loss in
the remaining (T+1) intervals as small as possible. The order we choose
will depend upon the past history known at the moment the order is
placed, and this past history will include quantities already ordered
but which will arrive in the future. Once we have found the proper
size for the last order, for any particular next-to-last order we can
compute the expected loss in the remaining (T+2) intervals. We choose
the size of the next-to-last order which minimizes the expected loss in
the remaining (T+2) intervals. This size will, of course, depend upon
the past history known at the moment the order is placed. And so, 
interval by interval, we work our way back to the first order. 
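
A minimal sketch of the same recursion with a delivery lag follows. It assumes, for illustration only, that the relevant past history reduces to the stock on hand plus the orders still in transit, that demand has the same distribution in every interval, and that the lag is at least one interval. The names `min_expected_loss`, `lag`, `loss`, and `max_order` are hypothetical.

```python
from functools import lru_cache

def min_expected_loss(n_intervals, lag, demand, loss, max_order):
    """Least expected total loss when an order placed at the start of
    interval t arrives at the start of interval t + lag (lag >= 1).

    demand                : dict {demand value: probability}
    loss(available, q, d) : loss in one interval with `available` stock,
                            new order q, and demand d
    Returns (expected loss, size of the first order).
    """

    @lru_cache(maxsize=None)
    def best(t, stock, pipeline):
        # pipeline[0] arrives now; pipeline[-1] was ordered most recently
        if t == n_intervals:
            return 0.0, None
        available = stock + pipeline[0]
        choices = []
        for q in range(max_order + 1):
            exp = 0.0
            for d, p in demand.items():
                left = max(available - d, 0)
                future, _ = best(t + 1, left, pipeline[1:] + (q,))
                exp += p * (loss(available, q, d) + future)
            choices.append((exp, q))
        return min(choices)

    return best(0, 0, (0,) * lag)
```

Orders placed too late to arrive before the end add cost without adding sales, so the recursion sets them to zero on its own whenever ordering carries a positive cost.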

Another generalization is to allow simultaneous demands for several 
different types of items, necessitating the stocking of more than one 
commodity. The demands may be interrelated in any way, and some 
commodities may be partial or complete substitutes for others. It is 
assumed that, given a particular set of demands and a particular set of 
commodities on hand, we know how to use the commodities most ef- 
fectively in trying to satisfy the demands. The schedule of losses in 
this case tells us what our loss is for any given set of demands and any 
given combination of commodities available, assuming that the com- 
modities available are allocated most effectively. In this case the prob- 
ability distribution of demand is a joint probability distribution of the 
different types of demand, which gives us the probability of observing 
any particular combination of demands, and an ordering policy must 
tell how much of each commodity to order at each stage. In computing 
an ordering policy for this case, the principles are the same as in the 
single commodity case, but the details are more troublesome, and for
a large number of items it may be practically impossible to carry out 
the computations. 

As a last generalization we have the case where the probability dis- 
tribution of demand is not completely known—we may know only that 
the distribution is of a certain type. Then, for any given ordering pol- 
icy, there will not be merely one expected loss, but a whole set of ex- 
pected losses, one for each possible distribution of demand. How then 
shall we compare two different ordering policies, since one may be bet- 
ter for some distributions of demand and worse for others? One method 
of doing this is to find, for each ordering policy, the maximum expected 
loss over all possible distributions, and then choose that ordering policy 
with the smallest maximum expected loss. This ordering policy is called 
a “minimax” policy because it minimizes the maximum expected loss. 

In the following example illustrating the minimax policy, some mathematical
terminology and methods will be used which may be unfamiliar
to some readers and can be omitted by them without too much loss. Let 
us go back to the very first example in this paper in which a newspaper 
vendor buys papers in the morning at 3 cents per copy, sells them to 
customers at 5 cents per copy, and resells unsold copies to the supplier 
at 1 cent per copy; but now we shall assume that the distribution of the 
demand, d, is normal, with standard deviation 50 and mean between 
1000 and 2000, the exact value of the mean being unknown. We want 
to find how many papers the vendor should buy in the morning accord- 
ing to the minimax policy. 
As before, the vendor’s schedule of losses is given by 


$$W(y, d) = \begin{cases} 2y - 4d & \text{for } d \le y \\ -2y & \text{for } d \ge y. \end{cases}$$


From this it is seen that for any order quantity, y, the vendor’s loss 
will be greatest when d is smallest. But as the mean of the normal dis- 
tribution of d decreases, small values of d become more probable and 
large values of d become less probable. Therefore, for any given y, the 
expected loss is greatest when the mean of the normal distribution of d 
is as small as possible, namely, 1000. Hence, if we choose the y that 
minimizes the expected loss when the mean of the distribution of d is 
1000, this will be the minimax y (the y with the smallest maximum ex- 
pected loss). For suppose we use some other y, then the maximum ex- 
pected loss using this other y will occur when the mean of the distribu- 
tion of d is 1000, and this maximum will be greater than if we had used 
the y that minimizes the expected loss when the mean of the distribu- 
tion of d is 1000. To find the minimax y, let F(x) denote the normal 
cumulative probability distribution function with mean 1000 and 
standard deviation 50, and f(x) the corresponding density function. 
Then the expected loss (assuming the mean of d is 1000) is equal to

$$\left[2y - 4\,\frac{\int_{-\infty}^{y} t f(t)\,dt}{F(y)}\right] F(y) - 2y\,[1 - F(y)] = -2y - 4\int_{-\infty}^{y} t f(t)\,dt + 4yF(y).$$


Differentiating with respect to y, we get [—2+4F(y) ]. This derivative 
is zero for F(y)=1/2, negative for F(y)<1/2, positive for F(y)>1/2. 
Therefore we should take y so that F(y) is equal to 1/2, which means y 
should be equal to 1000. This is the minimax y. 
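
The minimax conclusion can be checked numerically. The sketch below rewrites the expected loss as -2y + 4 E[max(y - d, 0)], which is the expression above with the integral rearranged, and evaluates it in closed form for normal demand. The grid of candidate means and the range of order quantities are assumptions made only for this check, and the function names are hypothetical.

```python
import math

SIGMA = 50.0

def expected_loss(y, mean, sigma=SIGMA):
    """E[W(y, d)] for normal demand: -2y + 4 E[max(y - d, 0)]."""
    z = (y - mean) / sigma
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return -2.0 * y + 4.0 * sigma * (z * cdf + pdf)

means = range(1000, 2001, 50)    # grid of candidate means (assumption)
orders = range(900, 1101)        # candidate order quantities (assumption)

def worst_case(y):
    """Maximum expected loss for order y over the candidate means."""
    return max(expected_loss(y, m) for m in means)

minimax_y = min(orders, key=worst_case)
```

On this grid the worst case for every order quantity occurs at a mean of 1000, and the order with the smallest worst case is 1000 papers.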
CENSUS TRACTS AND URBAN RESEARCH* 


Donald L. Foley 
University of California (Berkeley) 


Selected statistics have been reported on a census tract basis by the
Census Bureau for the past four decennial censuses. The number of
tracted cities has increased during this period from 10 to 72. In short,
the census tract statistical reporting system has become a well developed
source of information.

At recent Census Tract Conferences most of the discussion has cen- 
tered on applied uses of tract data. Thus representatives from business, 
market research [1], city planning and various social and health agen- 
cies have reported on putting census tracts to work. This paper supple- 
ments these earlier reports (1) by examining how census tract statistics 
have facilitated urban research of a more theoretical sort, (2) by dis- 
cussing some methodological problems that have been encountered, 
and (3) by suggesting ways in which census tracts can most effectively 
implement such research in the future. The focus here will tend toward 
pure rather than applied, and toward university rather than business or 
civic agency research. 


THE USE OF TRACT STATISTICS IN “PURE” URBAN RESEARCH 
In general, the tract reports issued by the Census Bureau have centered
around certain population and housing characteristics, areally
assigned according to home address. Each category of information has
usually been reported in frequency distribution form, from which
selected summary statistical measures (e.g., percentages or averages) can
be computed.

Research use of tracts is by no means limited to data reported by the 
Census Bureau. This is one of the intriguing assets of the tract report- 
ing system. Numerous and important additional types of information 
have been assembled by local agencies and researchers, although prob- 
ably more for applied than for pure research purposes [2, 3]. Thus, we 
have had tract statistics for juvenile delinquency [4, 5], receipt of wel- 
fare care [5, 6], births and deaths [5, 7], illness [5], mental illness [8], 
suicide [9], residential mobility [6, 10], etc. 

So much for an introductory look. Let us now turn to university
research. In which academic fields have tract data been used in the
conduct of pure research? In general, the research most directly promoted
has been that dealing with the differential characteristics of urban
residential subareas within large cities, being conveniently subsumed under
the label, human ecology [11]. Urban sociologists, urban geographers,
and land and real estate economists have been the most active devotees,
while various other social scientists and scattered professionals with
pure research interests, in such fields as municipal administration and
business, have been peripherally linked.

Discouraging as it may seem to proponents of the census tract sys- 
tem, a sober appraisal leaves the impression that there has been but 
limited pure research use of tract statistics and this mainly in the field 
of urban sociology. Urban geographers have relied on their own map- 
ping and descriptive skills and have generally shunned the compara- 
tive, quantitative methodology that would most logically provide a 
receptive context for using census tract data. Some social scientists, 
notably certain real estate economists, have placed greater reliance on 
census tabulations by city blocks than by census tracts [12]. In political 
science and in other branches of economics there seems to have been 
virtually no research use of the census tract system. 

What research patterns have been employed in adapting tract ma- 
terial to pure research use? An initial distinction here is between those 
studies where census tract statistics have provided the central data and 
those researches where tract figures have been used (although less spec- 
tacularly) in the selection of study districts [13] or in furnishing statis- 
tics of relatively minor importance. 

It would seem fruitful to identify six main ways in which tract sta- 
tistics have been used, viewed in methodological terms. These different 
patterns are not mutually exclusive; two or more may be interwoven 
within the same study. 

1. Descriptive use in which the differential incidence, by tract, of a 
single factor is reported. In this pattern the incidence variations can 
usually be conveniently summarized in map form [2], using what we 
may term an ecological map. Some comprehensive reports have in- 
cluded a series of such maps, reporting both census collected and lo- 
cally assembled statistics. Among the most ambitious of such reports 
are those for Minneapolis [14], Seattle [15], Cleveland [7], and Rochester 
[5, 16]. In some studies of very large cities, tracts have been combined 
to form concentric zones or sectors, with incidence rates reported ac- 
cordingly [4, 8, 17]. 

2. Descriptive use in which the cross-cutting of two or more separate 
incidence patterns is reported. In map form, this use involves either the 
comparison of two or more of the single factor maps, as prepared in use
(1), or the preparation of a single map in which the cross classification
of factors is shown with the aid of an appropriate legend [2, 18]. This 
step characteristically precedes the somewhat more sophisticated uses 
(4), (5), or (6). 

3. Time-series use in which changes, by tracts, are reported for stated 
periods of years. It is common here to summarize the findings in map 
form with legends indicating percentage increases or with time-series 
graphs fitted into the various tracts. Where statistics have been re- 
ported by concentric zones for some of the largest cities, various other 
forms of graphic analysis have been used, as in the Chicago studies of 
population succession by Cressey and Ford [19, 20]. 

4. Analysis of relationships, utilizing what has been termed ecological 
correlation [21]. Here the variables are summary measures, by census 
tracts. Thus, one can correlate per cent foreign born and median school 
years completed. In this case nativity status and education are not 
correlated directly, person by person, as in individual correlation.
Usually, in fact, in ecological correlation we do not know this information
on a person-to-person basis. Studies of this type have been conducted
in Chicago [4, 8, 22], St. Louis [6, 23], and other cities [5, 24]. 

5. The interpretation of individuals’ characteristics in terms of the 
general social environment of the tract. In this case the former emerge 
from the specific study while the latter is available in the form of pre- 
viously published tract statistics. Faris and Dunham [8], for example, 
utilized this design to demonstrate that mental illness rates were higher 
for Negroes and certain other groupings in areas (combinations of tracts) 
not primarily populated by their own members. 

6. Use in statistical index form, each index presumed to represent a 
cluster of factors. Thus, average rental [7] and median education [25] 
have been promoted as indices of socio-economic status. A challenging 
recent attempt to develop statistical indices is the work by Shevky and 
Williams using Los Angeles census tract data [18] in developing three 
indices: for social rank (roughly socio-economic status), for urbaniza- 
tion (a complex of factors relating to type of family life), and for segre- 
gation (the residential concentration of minority groups). Based on the 
alternate ways in which these three indices can be related to each 
other, the authors have suggested a typology of residential areas. Alter- 
nate segregation indices have also been suggested by other researchers 
[26, 27, 28]. Kendall and Lazarsfeld have presented a stimulating dis- 
cussion of the various types of indices usable at a tract level according 
to the alternative logical ways by which they relate to direct charac- 
terizations of the individuals included [29, pp. 187-196]. 
SOME METHODOLOGICAL PROBLEMS 


The research use of census tract data has involved a series of ecologi- 
cal and statistical assumptions, some of which have been reexamined 
during recent years. It should be recognized that the most vigorous ex- 
pansion of the census tract system tended to coincide with the phe- 
nomenal rise of the ecological “school” of social research. By now, the 
sociologist’s intellectual honeymoon with urban ecology is over and he 
is faced with the problem of settling down and living with this ecologi- 
cal approach. Such scholars of ecology as Hollingshead and Hawley [30] 
have in recent years identified theoretical difficulties inherent in “clas- 
sical” ecology and have indicated considerable skepticism regarding 
the future utility of spatial analysis, narrowly conceived. 

Let us examine some of the more specific assumptions that have been 
implicit in ecological research using census tract data: 

1. It was assumed during urban ecology’s early years that the large 
city was divided into “natural areas.” There was some belief that cen- 
sus tracts could be so established that they would coincide closely with 
these natural areas. Thus, internal homogeneity was sought and as- 
sumed for each tract [2, 31, 32]. The utility of the natural area concept 
has since been questioned by Hatt [33, 34] and the usefulness of data
from non-homogeneous tracts has been challenged by a number of re- 
searchers [15, Appendix B; 35]. Myers, for example, in a recent study 
concluded that of New Haven’s 28 census tracts “10 tracts are homo- 
geneous to a remarkable extent; seven are less homogeneous; while the 
remaining 11 are heterogeneous” [36]. 

2. With Burgess’ important concentric zone construct, it appeared 
likely that general principles of urban spatial patterning would emerge, 
embodied in an ecological theory of urban structure. It has become in- 
creasingly evident, however, that at best alternative constructs must 
be admitted, such as the Hoyt sector theory and Davie’s insistence on 
the industrial pattern’s primacy. A more pessimistic view concludes 
that for many cities historical or topographic factors have had so per- 
vasive an influence as seriously to limit the predictive value of the 
broader principles. So while some cities (Chicago, St. Louis, Rochester 
[37]) tend to uphold much of Burgess’ and/or Hoyt’s theories, other 
cities (Boston [38], Pittsburgh, New York, Flint [39]) have more com- 
plex patterns. We may eventually need to introduce a typological sys- 
tem such as Shevky’s that will be less geared to grand principles and 
more to identifying certain types of urban areas in whatever overall 
pattern they take. Where Burgess’ scheme has not proved applicable,
concentric mile zones, gradients, and similar ecological techniques
tend to lose much of their utility.

3. In many research projects it has been assumed that ecological in- 
dices were valid measures of certain social phenomena. In the study of 
juvenile delinquency, for example, the number of boys brought before 
a juvenile court or other agency (and expressed as a rate) has been used 
as an index of delinquency [4]. In the 1940 Census enumeration, the 
number or per cent of dwelling units “needing major repair” was
available as an index of housing condition.¹ In 1950 a “dilapidated”
category was introduced. But we have had relatively little systematic
validation of these indices. The work by Schmid in this connection is
important. Using 1940 tract statistics from 20 medium-sized cities, he
examined the degree to which a single index, such as educational level 
or rental level, is a valid measure of a larger complex of factors [25]. 

4. There has been some tendency for researchers rather uncritically 
to accept census tract statistics as reliable. There are conditions, how- 
ever, under which one should recognize that a sampling error may be 
present, particularly where the population base for the tract is small. 
This problem was recognized rather early in the development of the 
census tract system by various statisticians [40, 41, 42, 43], but it is not 
certain that all other users of tract statistics have heeded the cautions. 
Now in the 1950 census tract reports the problem has been reopened by 
the Census’ reliance on a 20 per cent sample for some nine published 
tract tabulations. This has resulted in such potentially important
statistical indices as years of schooling and family income now being
subject to sampling error.²

5. In the impressive series of studies that have used ecological cor- 
relation it has been assumed that correlations demonstrated meaning- 
ful interrelations of factors. Certain scholars in the 1930s [44, 45] and a 
recent vigorous article by Robinson [21] have pointed to serious
statistical difficulties implicit in ecological correlation. Robinson concludes
that “... the only reasonable assumption is that an ecological
correlation is almost certainly not equal to its corresponding individual
correlation” [21, p. 357]. These critics have thus shown not only that
ecological correlations run higher than individual correlations, but that
the fewer the ecological areas, the higher the correlations. Hence a
correlation for City X based on 20 large districts will turn out to be larger
than one based on 85 smaller districts, say, census tracts. Menzel, on
the other hand, has pleaded the case for ecological correlation where it
is clearly understood that the characteristics being correlated are
meaningfully interpretable in areal as well as (or instead of) in
individual terms [46].

¹ As a matter of fact, this index did not prove to be consistently valid when used in research in
St. Louis. This was apparently related to the subjectivity involved in its enumeration.

² This author is indebted to Professor Calvin Schmid for his emphasis on this problem. After
Schmid’s methodological research tended to validate the use of median school years completed as an
important index [25], we now find that in the 1950 census reports the utility of this statistical indicator
is somewhat reduced.

6. In most earlier ecological research a rather static approach was 
assumed. Hence, to study urban structure it was necessary to have 
statistics that related only to the individual in his tract of residence at 
the time of the enumeration. No information, for example, was pro- 
vided on home-work or home-shopping spatial relations. With rare 
exceptions (Cleveland statistics for several years during the 1930s 
[10]), we have lacked information on intertract residential mobility 
within a metropolitan region. We have had no usable information as to 
residents’ association memberships and psychological identifications. 


PROMISING FUTURE USES OF CENSUS TRACT DATA 


It now seems appropriate to summarize what appear to be some of 
the most fruitful continuing uses for census tract statistics in pure ur- 
ban research. For in spite of the skeptical tone in which a number of 
the above points have been phrased, it is apparent that the tract
reporting system, when judiciously utilized, fills a striking need and
certainly deserves to be maintained. It is far more economical for Census
reporting and more convenient for a variety of research applications 
than is reporting on a block basis. For the largest cities it provides a 
workable unit by which statistics can also be assembled for even larger 
areas or districts. 

An initial recommendation is that researchers in such academic fields 
as geography, political science, and social psychology be “educated” to 
the potential research adaptations of the census tract system. For ex- 
ample, the author recently overheard a political scientist admitting 
ignorance of census tract data, when questioned by a fellow sociologist. 
Nor had this political scientist heard of the recent study by Salmon and 
Olds of St. Louis voting behavior [23]. This scholar in the field of po- 
litical behavior showed considerable interest in the fact that tract sta- 
tistics could often be combined into ward statistics making possible 
ecological correlations between voting behavior and various social 
characteristics. 

A second suggestion is that census tract data may have their great- 
est general research value in providing rough ecological profiles. It 
would be misleading to promise too much in the name of census tract
statistics. They do not, for example, offer the areal refinement available 
from block data. Then, too, tract statistics are typically encumbered 
with certain limitations inherent in their functioning as statistical in- 
dices. There would seem to be a continuing need for guidance to poten- 
tial users. 

A third proposal flows from the second: that the use of tract statistics 
be integrated with other research approaches. An analysis of tract in- 
formation or the plotting of tract statistics on work maps can some- 
times be helpful at exploratory levels of research. Statistical profiles of 
particular sections of a city provide an excellent backdrop for non- 
quantitative case-study types of analyses. In Stephan’s words (dating 
from the mid 1930s), “Census tract research will probably be most ef- 
fective when considered not as a method of study complete in itself 
but as one step in a sequence of investigations” [39, p. 166 Suppl.]. 

Fourth, ecological correlations should be used only if it is clearly un- 
derstood that they tend to relate characteristics of areal units and that 
they are not adequate substitutes for individual correlation. If, for ex- 
ample, a researcher wants to study the correlates of the incidence of 
mental illness, he should recognize the methodological alternative of 
directly exploring the background characteristics of persons who are ill. 
The researcher should also take into account the effect of tract or areal 
unit size on the magnitude of the resulting ecological correlation. 

Fifth, there is a continuing need for ingenuity in introducing new 
types and forms of tract information. At the University of Miami, 
Wolff has been developing a technique for forecasting population by 
census tracts [47]. With many of the largest cities now having a back- 
ground of three or four decades of census statistics, such analysis of in- 
ternal population trends may become increasingly feasible. 

Under the sponsorship of the Social Science Research Council, the 
Pacific Coast Committee on Community Studies (Leonard Broom, 
Chairman) is currently preparing a research memorandum that will 
include several methodological contributions [48]. Schmid has been re- 
fining an approach whereby the Guttman scaling technique may be ap- 
plied to census tract data in an attempt to type residential areas. 
Robinson, Broom, Shevky, and Bell have all been engaged in further 
developing and testing areal typologies. These researches will be in- 
cluded in the Committee’s memorandum. One other recent West Coast 
attempt at developing urban subcultural areas is Wann’s research at 
the University of California [49]. 
There would seem to be a continuing need for more measures, on a
tract basis, of residential mobility and various commuting and activity
patterns. The brilliant theoretical study of residential mobility by 
Stouffer [50] was only possible because of Green’s unique assembly of 
intertract residence shifts [10]. If it were possible to replicate these 
statistics in other cities or to devise similar cross tabulations, by tracts, 
on home-to-work or on home-to-other-activity movements, our under- 
standing of daily population movements and of dependency on com- 
munity facilities could be enhanced. The coding of certain information 
by tract of employment or of shopping might be a helpful variant. 

And, finally, it seems appropriate to stress the need within each large 
city for effective communication among researchers so as to maximize 
the chances that data and methods from one study will have by-prod- 
uct value for succeeding studies. Base maps, street indexes, and certain 
arrangements for filing and interchanging data should be provided. 
The highly ingenious punched card system developed for St. Louis by 
Olds [51], although built around the city block as the basic unit, may 
have certain applicability on a tract or a block-and-tract basis for other 
cities. 


ON A PROBABILITY MECHANISM TO ATTAIN AN
ECONOMIC BALANCE BETWEEN THE RESULTANT
ERROR OF RESPONSE AND
THE BIAS OF NONRESPONSE


W. EDWARDS DEMING
New York University 


The author postulates a probability mechanism for the 
simultaneous production of the bias of nonresponse and for the 
variance of response. The nonresponse arises from a graded 
series of classes of the members of the universe to be 
sampled. The classes range from an impregnable core of no 
possible response, on up to a class of complete response. 
Nonresponse arises from two sources, not at home, and re- 
fusal. Refusals are of two kinds, permanent and temporary. 
The variation in the amount of time spent at home, and the 
variation in the firmness of the temporary refusal, produce the 
graded series of classes. The bias of nonresponse arises from 
the variation of any characteristic from one class to another. 
The variance of response arises from the variation of any 
characteristics from one member to another within a single 
class, and from the random variation in the number of re- 
sponses therefrom. 

An increase in the size of the initial sample or a more 
efficient method of selection will decrease the variance of 
response, but will have no effect on the bias of nonresponse. 
Successive recalls, on the other hand, decrease the bias of
nonresponse, and are more effective than an increase in the size
of the sample or a more efficient method of selection in de- 
creasing the root-mean-square error which arises from both 
nonresponse and from the variation of response. 

The results show that without recalls, it is hazardous to 
put any confidence in the result, no matter how big the sample, 
even when the variation in the measured characteristic is only 
two-fold from the class of lowest response to the class of 
highest response. 

With the levels of response assumed here (taken from aver- 
age urban experience), and with an estimate formed by 
summing up the initial call and the recalls, the first two recalls 
effect together about a 50% reduction in the initial bias of 
nonresponse. Further recalls continue to be productive. In 
fact, with this method of estimation, each recall added to a 
sampling plan, even to six recalls, actually increases the 
amount of information obtained for each dollar expended on 
interviewing. 

Even with three recalls, and with only a two-fold variation 
from the class of lowest response to the class of highest re- 
sponse, an initial sample bigger than the equivalent of from 
300 to 500 binomial cases in any one subclass is ineffective and
uneconomical. The apparent precision of a bigger sample is a 
delusion, as with bigger samples the bias of nonresponse will 
eclipse the error of sampling unless there are 4, 5, or more 
recalls. An attempted “complete count” is no exception and 
often represents an extreme waste of effort. 

For high accuracy, a plan that uses the ordinary method of
estimation by combining the initial attempt and the recalls 
must support 4, 5, or 6 recalls, along with an initial sample 
equivalent to from 800 to 1500 binomial cases. 

For any proposed survey, calculations based on rough ad- 
vance estimates of the constants that appear in the formulas 
will predict to a useful degree of approximation the biases and 
the variances to be expected from various types of plans. Fig- 
ures on costs will then point out which plan is most economical, 
of those that are possible, for the attainment of a prescribed 
accuracy. 

Where extremely high accuracy is required, the Politz plan 
with 2000 or more binomial cases becomes competitive in cost 
with a survey that depends on recalls. In any case, the 
Politz plan has the advantage of speed and of being able to 
produce results under circumstances wherein recalls are im- 
possible (for example, listening to a radio program). 

The proposed mechanism provides a theory of bias to sup- 
plement the theory of sampling. It indicates the possibility of 
new and more efficient methods of estimation than the simple 
combination of the initial attempt and the recalls, as it will 
provide a rational basis for extracting more information from 
the recalls. It will also point out, for any particular method 
of estimation, what empirical information will be helpful in the 
planning of the efficient allocation of effort amongst the initial 
sample and the recalls. 


THE AIM OF THIS RESEARCH 


NONRESPONSE in a survey is devastating and discouraging, whether
the survey be by mail or by interview. In careful survey-practice,
efforts have been made in many directions to reduce it. One usual 
solution is to find ways to build up the initial response. An additional 
solution is to call on the nonresponses, and to call and call. The first 
recorded systematic plan for putting pressure on a sample of
nonrespondents appears to have been carried out by Maurice Leven¹ in 1934.
Substitution does not help: it is only equivalent to building up the
size of the initial sample, leaving the bias of nonresponse undiminished.





1 Maurice Leven, The Incomes of Physicians (Chicago, 1932); pp. 12 and 13. Mr. Stanley Lebergott
of Washington called my attention to this work. With regard to the ineffectiveness of substitution, see, 
for example, Cochran, Sampling Techniques (Wiley, 1953), p. 302. 
Hansen and Hurwitz² found the optimum fraction of recalls to reach
the minimum sampling variance for a fixed total cost, the estimate being
formed, as usual, by pooling the initial responses and the recalls. Birn-
baum and Sirken³ apparently sought to minimize the mean square
error that arises from both the variance of the responses and from
failure to obtain interviews for any reason, including the permanent
refusals (this I judge from their “nonresponding groups responding yes,”
which is not clear to this author). Houseman⁴ presented new results on the
total bias that may arise from different classes of nonresponse. A new
approach, in surveys conducted by interviews, is the Politz plan,⁵
in which only the temporary refusals require recalls, as the correction 
for people not found at home while the interviewer is in the area is 
made by classifying the respondents according to the chance of finding 
them at home, and then by weighting the responses accordingly. 

It turns out that the bias of nonresponse is probably so serious in 
many if not most surveys that the specification of the number of recalls, 
and the adjustment of the original size of the sample to permit either 
the use of the Politz plan or the requisite number of recalls to balance 
the bias of nonresponse against the variance, and to stay within the 
allowable budget, are an essential part of sample-design where the aim 
is to produce as much information as possible per unit cost. 

The purpose of this paper is (a) to study the evidence produced by 
a proposed mechanism that will give rise to a calculable variance, to 
a calculable bias of nonresponse, and to a calculable cost; (b) on the 
basis of this mechanism to make a determination of the number of re- 
calls that are required to reach a desired accuracy at minimum cost. 
The allocation of the effort between the initial sample and the recalls 
is as important as the usual theory for calculating a sample-size. 

2 Morris H. Hansen and William N. Hurwitz, “The problem of nonresponse in sample surveys,”
Journal of the American Statistical Association, vol. 41, (1946), pp. 517-29. 

3 Z. W. Birnbaum and Monroe G. Sirken, “Bias due to nonavailability in sampling surveys,” Journal
of the American Statistical Association, vol. 45 (1950), pp. 98-111. “On the total error due to noninter-
view and to random sampling,” International Journal of Opinion and Attitude Research, vol. 4: pp. 179-
91. Cochran in his Sampling Techniques (Wiley, 1953) gives on page 296 an excellent summary of Birn-
baum and Sirken’s results.

4 Earl E. Houseman, “Statistical treatment of the nonresponse problem,” Agricultural Economics 
Research, vol. v (1953), pp. 12-19. 

5 The Politz plan was under discussion as early as 1945 in conversations between Mr. Politz and
this author. Experimental work thereon commenced in 1946 in the Alfred Politz research organization,
in which the weighting became routine through various simplifying procedures. Some theory and 
application were presented in a joint article by Alfred Politz and Willard R. Simmons, “An attempt to 
get the not-at-homes into the sample without call-backs,” Journal of the American Statistical Associa- 
tion, vol. 44 (1949), pp. 9-31. 

H. O. Hartley described what is essentially the Politz idea in a discussion of a paper that had been
read by Yates at a meeting in London (see Frank Yates, Journal of the Royal Statistical Society, vol.
cix (1946), p. 37 in particular), but Hartley made no mention of experimental work either accomplished
or intended.

A further purpose of the paper is to compare the results and the costs
of recalls with the alternative Politz plan. 


CRITERION FOR THE OPTIMUM PLAN 


We now define the root-mean-square error. The criterion to be 
adopted here for the optimum plan is that it shall deliver a prescribed 
mean square error at minimum cost. The root-mean-square error (to be 
abbreviated r-m-s error hereafter) of any plan of survey will by defini- 
tion denote the hypotenuse of a right triangle, one leg of which is the 
bias of the nonresponse that arises from the plan, and the other leg of 
which is the standard error of the plan (see Fig. 1). Different plans 
will have different triangles. By definition, the criterion for the opti- 
mum plan is that it shall give a shorter hypotenuse than any other plan 
will give for the same cost; or, alternately, a plan is optimum if it, 
among all possible plans, will deliver a prescribed length of hypotenuse 
at the lowest cost. One plan is “better” than another if it will yield a
shorter hypotenuse than the other, for the same cost.

[Figure: right triangle with legs labeled “The standard error of response” and “The bias of nonresponse.”]

FIGURE 1. Any plan of survey will possess a bias of nonresponse and a standard
error of response. The right angle addition of the two forms the root-mean-square
error of the particular plan.

There are a number of nonsampling errors in all surveys, whether complete or
sample.⁶ The bias of nonresponse is only one of them. It exists, of course,
in complete counts as well as in samples. In fact, the conclusions to 
be reached at the end will point to some drastic re-orientation of the 
effort expended on complete counts. Both the bias of nonresponse and 
the error of sampling exist in sample surveys. These are the two errors 
that within any particular framework of design of sampling, inter- 
viewing, and questioning, are direct functions of the size of the sample 
and of the number of recalls. 
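
As a minimal sketch of this criterion, the Python fragment below computes the r-m-s error as the hypotenuse of Fig. 1 and picks the better of two plans of equal cost. The function names and the numerical biases and standard errors are illustrative assumptions, not figures taken from this paper.

```python
import math

def rms_error(bias: float, standard_error: float) -> float:
    """Hypotenuse of the right triangle whose legs are the bias of
    nonresponse and the standard error of response."""
    return math.hypot(bias, standard_error)

# Hypothetical plans of equal cost (illustrative numbers only), each given
# as (bias, standard error) in units of the quantity being estimated.
PLAN_A = (0.06, 0.02)  # large initial sample, no recalls
PLAN_B = (0.03, 0.04)  # smaller initial sample, with recalls

def better_plan(plan_a, plan_b) -> str:
    """Return 'A' or 'B' for the plan with the shorter hypotenuse."""
    return "A" if rms_error(*plan_a) <= rms_error(*plan_b) else "B"

if __name__ == "__main__":
    print(rms_error(*PLAN_A), rms_error(*PLAN_B), better_plan(PLAN_A, PLAN_B))
```

In this invented case the plan with recalls has the larger standard error but the shorter hypotenuse, which is the kind of trade the criterion is meant to expose.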





6 A list of such errors with discussion is contained in Chapter 2 of Deming’s Some Theory of Sampling
(John Wiley, 1950); and in an article entitled “On errors in surveys,” American Sociological Review, ix 
(1944) 359-69. 
As one seldom knows the resultant magnitude of all the non-
sampling errors, and as they vary from one survey to another, the most 
sensible magnitude to aim at for the r-m-s error of the combination of 
sampling and of nonresponse (the hypotenuse in Fig. 1) will vary like- 
wise. One might aim at a r-m-s error of 7% in one survey, at 10% in 
another, and at 20% in another. Even with unlimited expenditure to 
reduce the r-m-s error to very low proportions, other errors will still 
be present unless funds are diverted to reduce them also. 


QUANTIFICATION OF THE PROBLEM 


The probability mechanism or model will now be described. The 
population to be sampled will be divided into six classes, according to 
the average proportion of interviews that will be completed success- 
fully out of 8 attempts. The classes will be designated by 0, 1, 2, 4, 
6, 8 to denote 0, 1, 2, 4, 6, 8 interviews completed, on the average, out 
of 8 attempts. These figures will often appear as subscripts to various 
other symbols. Six classes will be sufficient: more classes would not alter
the results enough to warrant the extra labor. 

We assume that under the conditions specified for any particular 
survey, failure to obtain an interview may arise from a multitude of
causes, which are manifest as not at home and refusal. We assume that 
people that refuse are of two kinds, those that give permanent refusals 
and those that give temporary refusals. People that give permanent 
refusals will never respond to any kind of treatment (they are a part 
of Class 0 defined more explicitly later). People that give temporary 
refusals are the kind that will refuse sometimes but will grant inter- 
views at other times or to other interviewers. An example of a tem- 
porary refusal is a case where the wrong interviewer called, or the right 
one called at the wrong time—woman bathing the baby, indisposed, 
family at dinner, etc. An interview might have been obtained with 
better luck in timing, or better luck in the selection of the interviewer. 

Class 0 contains the stubborn core of permanent impregnable re- 
fusals, plus the people who are never at home, gone to Florida, etc., 
or who are drunk when you do finally find them, or who turn out to 
be incapacitated otherwise and can not possibly give meaningful an- 
swers. At this moment we may note that the magnitude of this class 
varies widely, dependent on the type of information called for by the 
survey, and on the procedure of getting it. In a census, when people 
are away, or refuse, or are incapable of giving information, a good 
share of the required information can usually be obtained from neigh- 
bors, and is, although information on income must usually in such 
cases be left unanswered. Thus Class 0 in a count of the number of
inhabitants only is doubtless well below 1%, being reduced by the
cooperation of neighbors. But in surveys whose express purpose is
income, expenditures, savings, medical history, the neighbors are un- 
able to help, and Class 0 is bigger. I assume it to be 5% in the calcula- 
tions to be presented here. 

At the other extreme is Class 8, the people who 8 times out of 8 are 
at home and answer the questions. Moving inward from the soft outer 
shell (Class 8) toward the impregnable core, we encounter layers of 
increasing density. In Classes 6, 4, 2, 1 are the temporary refusals plus 
the people who are not home all the time. In Class 6 an interviewer will 
be successful at finding the respondent at home and in getting an inter-
view, on the average, 6 times out of 8; in Class 4, 4 times out of 8; etc.

Thus, we have not merely responding units and nonresponding units. 
Neither have we merely an overall proportion of response nor of non- 
response, but rather, response and (except for Class 8) nonresponse 
from each of several classes. We have not a mean value of some char- 
acteristic for the responses and some other value for the nonresponses; 
instead, each class possesses a mean and a variance. We are concerned
with the cumulative results from all classes. 


THE PATIENT MEAN

We define the “patient mean” as

$$a^* = \frac{\sum_{1}^{8} p_i a_i}{\sum_{1}^{8} p_i} = \frac{\sum_{1}^{8} p_i a_i}{1 - p_0} \tag{1}$$





wherein $a_i$ is the mean value per sampling unit of some particular
characteristic (rent, number of people employed, or something else)
in Class $i$, and $p_i$ is the proportion in this class. The patient mean will
be the datum from which we reckon the biases in later calculations, 
and the unit in which we shall measure the bias and the root-mean- 
square error of any plan. It is the result of calling back patiently ad 
infinitum on all the people in Classes 1, 2, 4, 6, 8. The members of 
Class 0 will also be included in the recall because in practice we have 
no way of separating them out; but as they yield no response, they 
contribute nothing to the patient mean. 
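
A short Python sketch of the patient mean follows. The 5% for Class 0 is the value assumed in the text; the other class proportions and the class means (a two-fold variation from Class 8 to Class 1) are hypothetical numbers chosen only for illustration, as are the names `P`, `A`, and `patient_mean`.

```python
# Hypothetical class proportions p_i (only p_0 = 5% comes from the text)
# and hypothetical class means a_i with a two-fold variation.
P = {0: 0.05, 1: 0.10, 2: 0.15, 4: 0.20, 6: 0.25, 8: 0.25}
A = {1: 2.0, 2: 1.8, 4: 1.5, 6: 1.2, 8: 1.0}

def patient_mean(p: dict, a: dict) -> float:
    """Patient mean: the p_i-weighted mean of a_i over Classes 1, 2, 4, 6, 8.
    Class 0 yields no response and so contributes nothing."""
    responding = [i for i in p if i != 0]
    numerator = sum(p[i] * a[i] for i in responding)
    denominator = sum(p[i] for i in responding)  # equals 1 - p_0
    return numerator / denominator

if __name__ == "__main__":
    print(patient_mean(P, A))
```

Because Class 0 enters neither sum, changing its size while keeping the other proportions in the same ratio leaves the patient mean unchanged.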










THE INITIAL SAMPLE (ATTEMPT I) 


The treatment will be simplified by the assumption that the initial 
sample is the mere drawing of n names from a list of N names (the 
frame). A more complex plan will cause no important modification in 
the conclusions with respect to the necessity for recalls, nor with 
respect to the number of recalls required for the most economical plan. 
It will not modify seriously the comparison with the Politz plan. It 
will, however, change the absolute figures on cost, but these are not the 
aim of this study; they are auxiliary only. By further assumption the 
frame will be so large compared with the sample that the multinomial 
term 

$$\frac{n!}{n_0!\,n_1!\,n_2! \cdots n_8!}\; p_0^{n_0}\, p_1^{n_1}\, p_2^{n_2} \cdots p_8^{n_8} \tag{2}$$

gives the probability that in the initial sample (Attempt I), there will
be $n_i$ names in Class $i$. $n$ is the size of the initial sample. $n_i$ is a random
variable; $p_i$ and $n$ are constants, satisfying the equations

$$\sum_{0}^{8} n_i = n \tag{3}$$

$$\sum_{0}^{8} p_i = 1. \tag{4}$$

If the sample ($n$) is as great as 10 per cent or more of the frame,
the variances and the biases to be computed should be reduced ap-
proximately by the factor $1-n/N$; in practice this reduction will be of
negligible importance.

When the returns from the initial call come in, we form from them
the numerical average for some particular characteristic and denote
it by $x(\mathrm{I})$. According to the particular mechanism postulated, the com-
position of $x(\mathrm{I})$ will be the fraction

$$x(\mathrm{I}) = \frac{\text{Sum of all the numerical values in the responses of Attempt I}}{\text{Number of responses in Attempt I}} \tag{5}$$


If we were able to separate the returns by class, this would appear as 


$$x(\mathrm{I}) = \frac{\sum R_i x_i}{\sum R_i} \tag{5a}$$

[Here and hereafter, sums will run over all classes except 0, unless indicated otherwise.]

wherein $R_i$ represents the number of responses from Class $i$, and $x_i$
represents the mean of the $R_i$ responses. Both $R_i$ and $x_i$ are random
variables. Their expected values are





$$E x_i = a_i \tag{6}$$

$$E R_i = n \pi_i p_i \tag{7}$$

where

$$\pi_i = \frac{i}{8}. \tag{8}$$

The variance of $x_i$ will be

$$\operatorname{Var} x_i = \frac{\sigma_i^2}{n \pi_i p_i}\left(1 + \frac{1 - \pi_i p_i}{n \pi_i p_i}\right), \tag{9}$$

wherein $\sigma_i$ is the standard deviation of the particular characteristic
in Class $i$. In what follows we shall drop terms in $1/n^2$; hence we shall
have no further use for the term $(1-\pi_i p_i)/n\pi_i p_i$ in the last equation.

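The quantity $x(\mathrm{I})$ in Equation 5 is a random variable. Under the
assumed probability mechanism its expected value will be

$$E(\mathrm{I}) = \frac{G}{H} \tag{10}$$

and its variance will be

$$\operatorname{Var}(\mathrm{I}) = \frac{1}{nH^2} \sum \pi_i p_i \left[\sigma_i^2 + \{a_i - E(\mathrm{I})\}^2\right], \tag{11}$$

where for convenience

$$G = \sum \pi_i p_i a_i \tag{12}$$

$$H = \sum \pi_i p_i. \tag{13}$$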

The derivation of Equations 10 and 11 is simple in the light of 
certain well-known principles of sampling. Let each sampling unit 
possess 8 cells, each one NR or R (NR for no response, R for response)
according to the following distribution:


Class 0,  8 NR,  0 R
Class 1,  7 NR,  1 R
Class 2,  6 NR,  2 R
Class 4,  4 NR,  4 R
Class 6,  2 NR,  6 R
Class 8,  0 NR,  8 R

Now when we draw a sample, we in effect draw first a sampling unit,
which will belong to one of the above classes. Next, we draw 1 of its
8 cells at random to determine whether we get a response. If we draw
an R-cell (a response), we write down the number $x_{ij}$, which will be a
random variable, the same for all the R-cells of an individual, but vary-
ing from one individual to another. If we draw an NR-cell (no response),
we make no record at all. The probability of getting a response in the
double drawing (first, an individual sampling unit; second, a cell) is
$\pi_i p_i$, which is only the expected proportion of all the responses that
will fall in Class $i$.

The mean of the entire set of responses in the frame will be 


$$a_R = \frac{\sum \pi_i p_i a_i}{\sum \pi_i p_i} = \frac{G}{H} \tag{14}$$

and their variance will be⁷

$$\sigma^2 = \frac{\sum \pi_i p_i \left[\sigma_i^2 + (a_i - a_R)^2\right]}{\sum \pi_i p_i}. \tag{15}$$


The double drawing is a random procedure in which each cell has 
the same probability as any other in the entire frame. The mean of 
the returns of a sample will therefore give an unbiased estimate of 
the mean of the entire set of responses; but this is only a restatement 
of Equation 10. The expected number of responses in a sample of n is 
n >_ mpi, wherefore the variance of a sample of n will be very closely 
equal to o?/n > mp; but this is only a restatement of Equation 11. 
And thus Equations 10 and 11 are established.*® 

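The bias in the expected result $E(\mathrm{I})$ of Attempt I will be defined as

$$B(\mathrm{I}) = E(\mathrm{I}) - a^*. \tag{16}$$

The mean square error of $x(\mathrm{I})$ will then be

$$\operatorname{Mse}(\mathrm{I}) = \operatorname{Var}(\mathrm{I}) + B^2(\mathrm{I}). \tag{17}$$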


If Figure 1 were drawn for Attempt I, the two terms on the right of this 
equation would be the squares of the two legs of the triangle, and the 
left-hand member would be the square of the hypotenuse. 
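
The quantities for Attempt I can be evaluated numerically along the following lines. This is a sketch only: the class proportions (apart from the 5% for Class 0), the class means, and the common standard deviation are hypothetical, and the variance uses the large-sample form $\sigma^2/(nH)$ described in the derivation above.

```python
# Hypothetical inputs: only p_0 = 5% comes from the text.
P = {0: 0.05, 1: 0.10, 2: 0.15, 4: 0.20, 6: 0.25, 8: 0.25}
A = {1: 2.0, 2: 1.8, 4: 1.5, 6: 1.2, 8: 1.0}
SIGMA = {i: 0.5 for i in A}  # within-class standard deviations (assumed)

def response_rate(i: int) -> float:
    """pi_i = i/8, the chance of a completed interview in Class i."""
    return i / 8

def attempt_one(n: int, p: dict, a: dict, sigma: dict):
    """Return (E(I), Var(I), B(I), Mse(I)) for an initial sample of n."""
    classes = list(a)  # Classes 1, 2, 4, 6, 8; Class 0 never responds
    g = sum(response_rate(i) * p[i] * a[i] for i in classes)
    h = sum(response_rate(i) * p[i] for i in classes)
    expected = g / h
    # Variance of the composite universe of responses, then divided by
    # the expected number of responses n*H.
    composite = sum(
        response_rate(i) * p[i] * (sigma[i] ** 2 + (a[i] - expected) ** 2)
        for i in classes
    ) / h
    variance = composite / (n * h)
    a_star = sum(p[i] * a[i] for i in classes) / sum(p[i] for i in classes)
    bias = expected - a_star
    return expected, variance, bias, variance + bias ** 2

if __name__ == "__main__":
    for n in (200, 400, 800):
        print(n, attempt_one(n, P, A, SIGMA))
```

Doubling `n` halves the variance but leaves the bias untouched, so the mean square error levels off at the square of the bias.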





7 This is the formula for the variance of a composite universe; see, for example, the author’s Some
Theory of Sampling (John Wiley & Sons. 1950), pp. 58 and 59. 

8 My colleague Dr. Benjamin J. Tepping discovered this simple way of deriving Equations 10 
and 11. He furnished also algebraic proofs, but they seem not to be required. 
ATTEMPTS II, III, IV


The nonresponses left over from the first attempt form a new frame. 
The sampling plan may prescribe 0, 1, 2, or more recalls on a sample of 
these nonresponses. 

The 1st recall will be identified here as Attempt II. The 2d and 3d
recalls will be Attempts III and IV. 

The determination of the optimum fraction (y) of the nonresponses 
of Attempt I to draw for recall will be a subject for investigation in a 
later paragraph. 

The bias of nonresponse arises from Classes 1, 2, 4, 6. Each successive 
attempt digs deeper into the lower classes, and diminishes the relative 
proportions that remain in the upper classes. Class 8 is in fact wiped 
out in Attempt I. In this way the combination of successive attempts 
pushes the accumulated result closer and closer to the patient mean $a^*$.

We assume that each attempt picks up a random sample of the non- 
responses in each class. This is not what happens, but it is probably 
impossible to put down an equation for what actually happens. The 
interviewers use ingenuity. They find out from neighbors when the 
people now absent will be at home. They make observations: they make 
appointments. They hold conferences to decide which one of them 
might best succeed in breaking down a refusal. Working for and working 
against the interviewers is some softening and also some hardening 
of the hearts of people who refused at an earlier call. I have seen them 
both. The net result is probably that the recalls are less costly (as 
Houseman says) than I assume in Table 3, and more successful than 
this theory indicates. If so, then the recommendations for recalls are 
even stronger than one may conclude from this theory alone. 

Equations 10 and 11 apply also for the results of Attempts II,
III, IV, if $n$ is treated in any attempt as the number of interviews at-
tempted, and if $p_i$ in Equations 10-13 is replaced by:

$$(1-\pi_i)\,p_i \Big/ \sum (1-\pi_i)\,p_i \qquad \text{Attempt II}$$

$$(1-\pi_i)^2\,p_i \Big/ \sum (1-\pi_i)^2\,p_i \qquad \text{Attempt III}$$

$$(1-\pi_i)^3\,p_i \Big/ \sum (1-\pi_i)^3\,p_i. \qquad \text{Attempt IV}$$

Class 8 contributes nothing to these sums, being wiped out by the fac-
tor $1-\pi_i$ which is 0 when $i=8$.
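
The replacement proportions can be tabulated with a few lines of Python. This is a sketch under stated assumptions: the class proportions are hypothetical apart from the 5% for Class 0, the function name is invented, and the sums are taken over all classes except 0, following the convention stated with Equation 5a.

```python
# Hypothetical class proportions; only p_0 = 5% comes from the text.
P = {0: 0.05, 1: 0.10, 2: 0.15, 4: 0.20, 6: 0.25, 8: 0.25}

def attempt_proportions(p: dict, attempt: int) -> dict:
    """Proportions that replace p_i at a given attempt
    (1 = Attempt I, 2 = Attempt II, and so on)."""
    classes = [i for i in p if i != 0]
    # Each earlier attempt leaves a fraction (1 - pi_i) of Class i unanswered.
    remaining = {i: (1 - i / 8) ** (attempt - 1) * p[i] for i in classes}
    total = sum(remaining.values())
    return {i: remaining[i] / total for i in classes}

if __name__ == "__main__":
    for k in (1, 2, 3, 4):
        print(k, attempt_proportions(P, k))
```

With these invented proportions the share of Class 1 among the names still outstanding grows at every attempt, while Class 8 disappears after Attempt I.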


EQUATIONS FOR THE COMBINATION OF ATTEMPTS 


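If the plan of survey calls for two recalls, we combine the results of
Attempts I, II, III. With an obvious extension of notation, the result
of this combination will be

$$x(\mathrm{I}+\mathrm{II}+\mathrm{III}) = u_{\mathrm{I}}\,x(\mathrm{I}) + u_{\mathrm{II}}\,x(\mathrm{II}) + u_{\mathrm{III}}\,x(\mathrm{III}), \tag{18}$$

where $u_{\mathrm{I}}$, $u_{\mathrm{II}}$, $u_{\mathrm{III}}$ are weights. If $R_{\mathrm{I}}$, $R_{\mathrm{II}}$, $R_{\mathrm{III}}$ are the responses in the
three separate attempts, then

$$u_{\mathrm{I}},\ u_{\mathrm{II}},\ u_{\mathrm{III}} = \frac{R_{\mathrm{I}},\ R_{\mathrm{II}},\ R_{\mathrm{III}}}{R_{\mathrm{I}} + R_{\mathrm{II}} + R_{\mathrm{III}}}. \tag{19}$$

For the expected value of $x(\mathrm{I}+\mathrm{II}+\mathrm{III})$ we may write with sufficient
approximation

$$E(\mathrm{I}+\mathrm{II}+\mathrm{III}) = w_{\mathrm{I}}\,E(\mathrm{I}) + w_{\mathrm{II}}\,E(\mathrm{II}) + w_{\mathrm{III}}\,E(\mathrm{III}), \tag{20}$$

wherein $w_{\mathrm{I}}$, $w_{\mathrm{II}}$, $w_{\mathrm{III}}$ are the expectations of $u_{\mathrm{I}}$, $u_{\mathrm{II}}$, $u_{\mathrm{III}}$. Formally, with
sufficient approximation,

$$w_{\mathrm{I}},\ w_{\mathrm{II}},\ w_{\mathrm{III}} = \frac{\sum \pi_i p_i,\ \ \sum \pi_i p_i (1-\pi_i),\ \ \sum \pi_i p_i (1-\pi_i)^2}{\sum \pi_i p_i \left[1 + (1-\pi_i) + (1-\pi_i)^2\right]}. \tag{21}$$

Before proceeding, we note that

$$u_{\mathrm{I}} + u_{\mathrm{II}} + u_{\mathrm{III}} = 1, \qquad w_{\mathrm{I}} + w_{\mathrm{II}} + w_{\mathrm{III}} = 1. \tag{22}$$

The bias of $x(\mathrm{I}+\mathrm{II}+\mathrm{III})$ of the combined results of Attempts I, II,
III will be defined as

$$B(\mathrm{I}+\mathrm{II}+\mathrm{III}) = E(\mathrm{I}+\mathrm{II}+\mathrm{III}) - a^*. \tag{23}$$

The variance of $x(\mathrm{I}+\mathrm{II}+\mathrm{III})$ may be computed as

$$\operatorname{Var}(\mathrm{I}+\mathrm{II}+\mathrm{III}) = w_{\mathrm{I}}^2 \operatorname{Var}(\mathrm{I}) + w_{\mathrm{II}}^2 \operatorname{Var}(\mathrm{II}) + w_{\mathrm{III}}^2 \operatorname{Var}(\mathrm{III}). \tag{24}$$

The notation in the above equations can easily be extended or con-
tracted to more or to fewer attempts. For a plan that uses only one re-
call, we simply drop the symbol III; also the term $(1-\pi_i)^2$ in Equa-
tion 21. For a plan that uses three recalls, we annex a term in IV, and
replace $(1-\pi_i)^2$ by $(1-\pi_i)^2+(1-\pi_i)^3$.
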
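
A hedged numerical sketch of the combination: the code below forms the expected weights and the combined expectation for any number of attempts, taking every remaining nonresponse to be recalled at each attempt, which is the case the weights above describe. The class proportions (other than the 5% for Class 0) and class means are hypothetical, and the function names are invented for illustration.

```python
# Hypothetical inputs; only p_0 = 5% comes from the text.
P = {0: 0.05, 1: 0.10, 2: 0.15, 4: 0.20, 6: 0.25, 8: 0.25}
A = {1: 2.0, 2: 1.8, 4: 1.5, 6: 1.2, 8: 1.0}

def rate(i: int) -> float:
    """pi_i = i/8."""
    return i / 8

def expected_responses(p: dict, attempt: int) -> float:
    """Expected responses per name in the initial sample at a given
    attempt (1 = Attempt I): sum of pi_i * p_i * (1 - pi_i)**(attempt-1)."""
    return sum(
        rate(i) * p[i] * (1 - rate(i)) ** (attempt - 1) for i in p if i != 0
    )

def expected_mean(p: dict, a: dict, attempt: int) -> float:
    """E(I), E(II), ...: the ratio G/H with p_i replaced for that attempt
    (the normalizing constant of the replacement cancels in the ratio)."""
    numerator = sum(
        rate(i) * p[i] * (1 - rate(i)) ** (attempt - 1) * a[i] for i in a
    )
    return numerator / expected_responses(p, attempt)

def combine(p: dict, a: dict, attempts: int):
    """Return the expected weights and the combined expected value."""
    shares = [expected_responses(p, k) for k in range(1, attempts + 1)]
    weights = [s / sum(shares) for s in shares]
    combined = sum(
        w * expected_mean(p, a, k) for k, w in enumerate(weights, start=1)
    )
    return weights, combined

if __name__ == "__main__":
    a_star = sum(P[i] * A[i] for i in A) / sum(P[i] for i in A)
    for k in (1, 2, 3, 4):
        weights, value = combine(P, A, k)
        print(k, weights, value - a_star)
```

With these invented figures the combined expectation moves toward the patient mean as attempts are added, so the printed bias shrinks in absolute value with each recall.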
THE POLITZ PLAN


The Politz plan includes questions to inquire of each person found
at home, and who does not refuse, to ascertain whether he was at home
last night at this time, the night before last, etc., to cover the 5 nights
preceding the interview, 6 nights in all. Each return is given a weight
$w_t$, the reciprocal of the number of nights at home over the period of 6
successive nights. The result of applying the Politz plan will be the
random variable

$$x(P) = \frac{S\, w_t R_t x_t}{S\, w_t R_t} \tag{25}$$

wherein $S$ denotes the sum over the 6 Politz classes, and wherein $R_t$
and $x_t$ denote the number of responses and their mean value in the
Politz class $t$. $w_t = 6/(1+t)$, where $t$ is the number of nights at home
during the preceding 5 nights. $w_t$, $R_t$, and $x_t$ are all random variables.

In each class except Class 8 it is possible for a person to be at home,
during the preceding 5 nights, some number of nights other than his
average ($\pi_i$). Thus, $E w_t$ is not the reciprocal of $\pi_i$, but takes the values
shown in Equation 29. By applying the formula

$$E\,\frac{u}{v} = \frac{Eu}{Ev} + \cdots \tag{26}$$

it is possible to find the expected value of x(P) and to show that the 
Politz correction for not being at home leads to the bias 


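$$\begin{aligned}
B(P) &= E\,x(P) - a^* \qquad \text{[Definition]} \\
&= a_P - a^* - \left\{\left(1 - \frac{1}{n}\right)\frac{1}{V^2}\sum (\pi_i p_i)^2 (B_i - A_i^2)(a_i - a_P) + \frac{1}{nV^2}\sum \pi_i p_i B_i (a_i - a_P)\right\}.
\end{aligned} \tag{27}$$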


The terms in the braces are very small numerically, and we accept 
with sufficient approximation for our purpose, 


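$$B(P) = a_P - a^* \tag{28}$$

wherein $\pi_i = i/8$, as heretofore, and

$$\begin{aligned}
A_i = E w_t &= E\,\frac{6}{1+t} && \text{[For Class } i\text{]} \\
&= \sum_{t=0}^{5} \frac{6}{1+t}\binom{5}{t}\pi_i^{\,t}(1-\pi_i)^{5-t} && \text{[Assuming that } t \text{ is a binomial variate]} \\
&= \frac{1}{\pi_i}\sum_{t=0}^{5}\binom{6}{t+1}\pi_i^{\,t+1}(1-\pi_i)^{5-t} \\
&= \frac{1}{\pi_i}\left[1 - (1-\pi_i)^6\right],
\end{aligned} \tag{29}$$

$$B_i = E w_t^2 = E\left(\frac{6}{1+t}\right)^2 = \sum_{t=0}^{5}\left(\frac{6}{1+t}\right)^2\binom{5}{t}\pi_i^{\,t}(1-\pi_i)^{5-t}, \tag{30}$$

$$a_P = \frac{E\,S\,w_t R_t x_t}{E\,S\,w_t R_t} = \frac{1}{V}\sum \pi_i p_i A_i a_i, \tag{31}$$

$$V = \sum \pi_i p_i A_i. \tag{32}$$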


The bias of a plan that uses $k-1$ recalls may be written

$$B(\mathrm{I} - K) = \frac{\sum p_i\left[1 - (1-\pi_i)^k\right] a_i}{\sum p_i\left[1 - (1-\pi_i)^k\right]} - a^* \tag{33}$$

to the same approximation that appears in Equation 28. With $k=3$,
for example, this form gives a numerical verification of the bias
$B(\mathrm{I}+\mathrm{II}+\mathrm{III})$ calculated otherwise by Equations 20 and 23.

The variance of the Politz plan is⁹

$$\operatorname{Var}(P) = \frac{1}{nV^2}\sum \pi_i p_i B_i\left[\sigma_i^2 + (a_i - a_P)^2\right] + \left(1 - \frac{1}{n}\right)\frac{1}{V^2}\sum (\pi_i p_i)^2 (B_i - A_i^2)(a_i - a_P)^2. \tag{34}$$

It is worth noting that if we place $A_i = B_i = 1$, the second term vanishes,
and the right-hand member reduces precisely to Equation 15, as it
should.
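
The Politz weights lend themselves to a quick numerical check. The sketch below evaluates the expected weight both as the binomial sum and in the closed form $[1-(1-\pi_i)^6]/\pi_i$, then evaluates the expected Politz mean as the ratio of $\sum \pi_i p_i A_i a_i$ to $\sum \pi_i p_i A_i$. The class proportions (other than the 5% for Class 0) and class means are hypothetical, and the function names are invented for illustration.

```python
from math import comb

# Hypothetical inputs; only p_0 = 5% comes from the text.
P = {0: 0.05, 1: 0.10, 2: 0.15, 4: 0.20, 6: 0.25, 8: 0.25}
A = {1: 2.0, 2: 1.8, 4: 1.5, 6: 1.2, 8: 1.0}

def mean_weight(pi_i: float) -> float:
    """A_i: expected value of 6/(1+t), with t binomial on 5 nights."""
    return sum(
        6 / (1 + t) * comb(5, t) * pi_i ** t * (1 - pi_i) ** (5 - t)
        for t in range(6)
    )

def mean_weight_closed(pi_i: float) -> float:
    """Closed form of A_i: [1 - (1 - pi_i)**6] / pi_i."""
    return (1 - (1 - pi_i) ** 6) / pi_i

def mean_square_weight(pi_i: float) -> float:
    """B_i: expected value of [6/(1+t)]**2 under the same binomial."""
    return sum(
        (6 / (1 + t)) ** 2 * comb(5, t) * pi_i ** t * (1 - pi_i) ** (5 - t)
        for t in range(6)
    )

def politz_expected_mean(p: dict, a: dict) -> float:
    """a_P: ratio of sum(pi_i p_i A_i a_i) to V = sum(pi_i p_i A_i)."""
    v = sum((i / 8) * p[i] * mean_weight(i / 8) for i in a)
    return sum((i / 8) * p[i] * mean_weight(i / 8) * a[i] for i in a) / v

if __name__ == "__main__":
    a_star = sum(P[i] * A[i] for i in A) / sum(P[i] for i in A)
    for i in A:
        print(i, mean_weight(i / 8), mean_square_weight(i / 8))
    print("approximate Politz bias:", politz_expected_mean(P, A) - a_star)
```

Since $\pi_i A_i = 1-(1-\pi_i)^6$, the expected Politz mean under these assumptions coincides with the expectation of a plan of six attempts, which is one way to see why the plan can stand in for recalls.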





9 My equation for the Politz plan differs from the equations given by Politz and Simmons.
SEARCH FOR THE OPTIMUM PLAN


The accumulated mean square error of Attempts I, II, and III will 
be 


$$M(\mathrm{I}+\mathrm{II}+\mathrm{III}) = \operatorname{Var}(\mathrm{I}+\mathrm{II}+\mathrm{III}) + B^2(\mathrm{I}+\mathrm{II}+\mathrm{III}). \tag{35}$$


We drop the symbol III for a plan that calls for two attempts, and we 
annex IV for a plan that calls for four attempts. 

Any two plans may be expected to incur different costs and to yield 
different mean square errors. As agreed at the beginning, a plan is 
optimum if its cost is less than that of any other plan that will yield 
the same mean square error. This is a matter of numerical calculation. 

Numerical assignments to the various fundamental magnitudes (p_i,
a_i, σ_i) will occur two sections ahead.

We have one other task—the determination of the optimum frac- 
tion y, a subject for the next section. 


DETERMINATION OF THE OPTIMUM FRACTION OF NONRESPONSES 
TO INCLUDE IN THE RECALLS 


Let y denote the fraction sought. We remind the reader that At- 
tempt III will be a canvass of all the nonresponses that remain from 
Attempt II, and that Attempt IV will be a canvass of all the non- 
responses that remain from Attempt III. There is thus only the one 
fraction y to determine. 

The mean square error (M) of the accumulated result of any num- 
ber of attempts may be written in terms of n and y as 


$$ M = A + \frac{B}{n} + \frac{C}{ny}, \tag{36} $$

the cost of which is

$$ Y = Dn + Eny. \tag{37} $$

A, B, C, D, E are constants. As before, n is the initial sample for
Attempt I. By differentiation it can be shown that, for a fixed value of Y,
the minimum in M occurs when

$$ y^2 = \frac{CD}{BE}. \tag{38} $$

This result is independent of n, hence it holds for any initial size of
sample.

The equation for y² just given contains D and E only in the ratio
D:E, which shows that y does not depend directly on the absolute
magnitudes to be assumed for the costs in Attempt I and later, but
rather on the ratio of these costs. And as y will be proportional to
√(D:E), y is relatively insensitive to the ratio assumed for D:E. Moreover,
y is not dependent on the absolute magnitudes of the a_i, but on
their ratio to any one of them, or to a*, because B and C occur only in
the ratio B:C.
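
The minimization behind Equation 38 can be checked numerically. The Python sketch below is an illustration only: the constants A, B, C, D, E are hypothetical values chosen so that the optimum falls at y = 0.6, and the function names are not from the paper. It eliminates n through the cost constraint Y = Dn + Eny, evaluates M on a grid of y, and compares the grid minimum with the closed form y² = CD/(BE).

```python
import math


def optimum_fraction(B, C, D, E):
    """Closed-form optimum from Equation 38: y^2 = C*D / (B*E)."""
    return math.sqrt(C * D / (B * E))


def mse_at_fixed_cost(y, total_cost, A, B, C, D, E):
    """M = A + B/n + C/(n*y), with n fixed by the budget Y = D*n + E*n*y."""
    n = total_cost / (D + E * y)
    return A + B / n + C / (n * y)


# Hypothetical constants, for illustration only.
A, B, C, D, E = 0.003, 1.0, 0.225, 3.0, 1.875

y_closed = optimum_fraction(B, C, D, E)
grid = [k / 1000 for k in range(1, 2001)]
y_small_budget = min(grid, key=lambda y: mse_at_fixed_cost(y, 3000.0, A, B, C, D, E))
y_large_budget = min(grid, key=lambda y: mse_at_fixed_cost(y, 30000.0, A, B, C, D, E))

print(y_closed, y_small_budget, y_large_budget)
```

The grid minimum is the same for both budgets, which illustrates the remark that the optimum y does not depend on n.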

Table 4 shows the optimum values of y obtained from Equation 38; 
also the values selected for actual use in the calculations. The fraction 
y obviously varies slowly with the number of recalls. To simplify the
required calculations I have set y=3/5 for all plans with the first set
of a_i; and y=1/4 for all plans with the second set of a_i.

It may be of interest to note that the removal of the bias of non- 
response by recalls is independent of the fraction y. It is not necessary 
to recall on the optimum fraction, nor on any other particular fraction,
so far as the bias of nonresponse is concerned. However, as y decreases,
the cost goes down but the variance and the r-m-s error increase,
so it is wise not to make y too small. The optimum fraction, if
it can be predicted on experience, or some approximation thereto, will 
guide one close to the minimum r-m-s error for any permissible cost of 
interviewing. 

NUMERICAL MAGNITUDES ASSUMED 


In order to make numerical calculations and to derive conclusions
therefrom with respect to the most economical design of surveys, it is
necessary to assume some numerical magnitudes for the p_i, a_i, σ_i; also for
the costs. Unfortunately, no set of numerical magnitudes can be typical
of all conditions met in the field. I may interject the reminder that
every question on a questionnaire has not only its own particular values
of a_i and of σ_i, but of p_i as well, even within the same survey, because
some questions receive better cooperation than others. The best
that one can do is to make numerical assumptions that fit some of the
conditions met in practice, and to infer from the equations the range
of validity of the conclusions.

The basic numerical assumptions are in Table 1. The expected number
of interviews, of responses, and of nonresponses, are shown in Table 2.
The response rates (the p_i) assumed here are intended to assimilate
average urban experience on a question of moderate difficulty; and
without making them responsible for the final choice, I wish to thank
Messrs. Lester R. Frankel and Robert Weller of the Alfred Politz Research
organization for their help and interest in choosing these particular
values.

TABLE 1

NUMERICAL VALUES ASSUMED

Property and symbol                Class:  0      1      2      4      6      8      1-8
Proportion, p_i                            .05    .10    .10    .20    .25    .30    .95
Mean value of the measured
  characteristic, a_i (1st set)            xxx    2.00   1.75   1.50   1.25   1.00   a* = 1.355 263
  characteristic, a_i (2d set)             xxx    .10    .20    .40    .60    1.00   a* = 0.589 474
Standard deviation, σ_i                    xxx    Same as a_i in both sets of a_i


TABLE 2

THE EXPECTED SIZES OF SAMPLE IN THE VARIOUS ATTEMPTS, BASED ON AN
INITIAL SAMPLE OF n IN ATTEMPT I. HERE THE SUMS RUN OVER ALL CLASSES, 0 TO 8

Attempt   Interviews                    Responses                  Nonresponses
I         n                             n Σ π_i p_i                n Σ (1-π_i) p_i
II        n_II  = ny Σ (1-π_i) p_i      ny Σ (1-π_i) π_i p_i       ny Σ (1-π_i)^2 p_i
III       n_III = ny Σ (1-π_i)^2 p_i    ny Σ (1-π_i)^2 π_i p_i     ny Σ (1-π_i)^3 p_i
IV        n_IV  = ny Σ (1-π_i)^3 p_i    ny Σ (1-π_i)^3 π_i p_i     ny Σ (1-π_i)^4 p_i
V         n_V   = ny Σ (1-π_i)^4 p_i    ny Σ (1-π_i)^4 π_i p_i     ny Σ (1-π_i)^5 p_i
VI        n_VI  = ny Σ (1-π_i)^5 p_i    ny Σ (1-π_i)^5 π_i p_i     ny Σ (1-π_i)^6 p_i
VII       n_VII = ny Σ (1-π_i)^6 p_i    ny Σ (1-π_i)^6 π_i p_i     ny Σ (1-π_i)^7 p_i

Numerical values based on an initial sample of n = 1000

Attempt   Interviews          Responses   Nonresponses
I         n     = 1000        625.0       375.0
II        n_II  = 375.0y      126.6y      248.4y
III       n_III = 248.4y      60.3y       188.1y
IV        n_IV  = 188.1y      34.4y       153.7y
V         n_V   = 153.7y      22.2y       131.5y
VI        n_VI  = 131.5y      15.6y       115.9y
VII       n_VII = 115.9y      11.7y       104.2y


Fortunately, there is a great deal more generality in the two sets of
a_i than may be apparent at first sight, for one may transform either one
of these sets into almost any other that he may encounter. For example,
to discuss a yes-and-no survey in which the proportions of yes
vary from 60% in Class 1 to 40% in Class 8, one has only to derive a
new value a_i' from an old a_i by setting

$$ a_i' = 20 + 20a_i \tag{39} $$

where a_i on the right belongs to the 1st set of a_i in Table 1. Both a_i
and a_i'-20 have a 2-fold variation from Class 1 to Class 8. The new
patient mean is

$$ a'^* = 20 + 20a^* \tag{40} $$

where a* = 1.355 263, the patient mean of the 1st set of a_i, as given in
Table 1. The relative bias computed for a_i'-20, for any number of attempts,
will be precisely the same as the relative bias computed for a_i
(Table 5). It follows that the new expected value for any number of
attempts will be

$$ E' = 20a^*\,\mathrm{Rel}\,B + a'^* = 47.105 + 27.105\,\mathrm{Rel}\,B \tag{41} $$

where Rel B is the relative bias shown in Table 5 for the corresponding
number of attempts. An example will occur later (Table 9).
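
The transformation of Equations 39 to 41 can be illustrated in Python. This sketch is not from the original paper: it assumes the Table 1 values, takes π_i = i/8, and pools Attempts I..K with weights p_i[1 - (1 - π_i)^K] as in Equation 33. It computes the expected percentage of yes in two ways, directly from the transformed a_i' and through Equation 41, and the two agree. For K = 1 and K = 2 the result rounds to 44.10 and 45.05, the first two entries of Table 9.

```python
# Assumed inputs, copied from Table 1 (Classes 1, 2, 4, 6, 8).
CLASSES = [1, 2, 4, 6, 8]
P = [0.10, 0.10, 0.20, 0.25, 0.30]
A_FIRST = [2.00, 1.75, 1.50, 1.25, 1.00]


def pooled_expectation(k_attempts, a):
    """Expected pooled result of Attempts I..K for class means a."""
    weights = [p * (1.0 - (1.0 - c / 8.0) ** k_attempts)
               for p, c in zip(P, CLASSES)]
    return sum(w * ai for w, ai in zip(weights, a)) / sum(weights)


def patient_mean(a):
    """Mean over Classes 1-8, i.e. the limit of indefinitely many recalls."""
    return sum(p * ai for p, ai in zip(P, a)) / sum(P)


A_YES = [20.0 + 20.0 * ai for ai in A_FIRST]    # Equation 39
A_STAR = patient_mean(A_FIRST)
A_STAR_YES = patient_mean(A_YES)                # Equation 40: 20 + 20 a*


def expected_yes_direct(k_attempts):
    return pooled_expectation(k_attempts, A_YES)


def expected_yes_eq41(k_attempts):
    rel_bias = (pooled_expectation(k_attempts, A_FIRST) - A_STAR) / A_STAR
    return 20.0 * A_STAR * rel_bias + A_STAR_YES   # Equation 41


for k in (1, 2, 3):
    print(k, round(expected_yes_direct(k), 2), round(expected_yes_eq41(k), 2))
```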

The 2d set of a_i could serve the same purpose by a suitable transformation,
but we shall not carry it through.

Thus, in spite of the limitations of any particular set of numerical 
assumptions, the conclusions to be drawn will warrant some sweeping 
generalizations. 

COSTS 


For the costs of making calls (interviewing only) we assume for cal- 
culation the following figures: 


For Attempt I, $3 per call 
For later attempts, $5 per call 
For the Politz plan, $4 per name 


This amount will cover the cost of weighting 
and of calling back on the temporary refusals. 


Table 3 shows the costs of interviewing derived from the values assumed
for the p_i in Table 1, and with the cost per call as mentioned
earlier. Here n is the size of the initial sample, and y is the fraction of the
nonresponses left over from Attempt I that constitute the sample for
Attempt II.

TABLE 3
COSTS OF INTERVIEWING

Plan                               No. of recalls   Cost (dollars)
Attempt I                          0                Y = 3n
Attempts I+II                      .3750ny          3n + 1.8750ny
Attempts I-III                     .6234ny          3n + 3.1172ny
Attempts I-IV                      .8115ny          3n + 4.0576ny
Attempts I-V                       .9653ny          3n + 4.8263ny
Attempts I-VI                      1.0968ny         3n + 5.4839ny
Attempts I-VII                     1.2126ny         3n + 6.0632ny
Politz (equivalent to 5 recalls)                    4n

The actual numerical magnitudes of these costs are not so important
as their relative magnitudes. If all the costs were doubled, the cost
computed for any plan would be doubled, but the relative costs and the
relative merits of the various plans would remain unchanged.
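
The entries of Tables 2 and 3 follow from the assumed proportions by simple sums, as the Python sketch below illustrates. It is not part of the original paper; it assumes the p_i of Table 1 for Classes 0 to 8, π_i = i/8 (so that Class 0 never responds), and the assumed costs of $3 per call in Attempt I and $5 per call thereafter. The function names are illustrative.

```python
# Assumed inputs: Classes 0, 1, 2, 4, 6, 8 of Table 1, with pi_i = i/8.
PI = [0 / 8, 1 / 8, 2 / 8, 4 / 8, 6 / 8, 8 / 8]
P = [0.05, 0.10, 0.10, 0.20, 0.25, 0.30]
COST_FIRST_CALL = 3.0    # dollars per call, Attempt I
COST_RECALL = 5.0        # dollars per call, later attempts


def nonresponse_share(j):
    """Sum over all classes of (1 - pi_i)^j * p_i (Table 2)."""
    return sum(p * (1.0 - pi) ** j for pi, p in zip(PI, P))


def recall_interviews_per_ny(last_attempt):
    """Interviews in Attempts II..last_attempt, as a multiple of n*y."""
    return sum(nonresponse_share(j) for j in range(1, last_attempt))


def interviewing_cost(n, y, last_attempt):
    """Cost of Attempts I..last_attempt, as in Table 3."""
    recalls = recall_interviews_per_ny(last_attempt) * n * y
    return COST_FIRST_CALL * n + COST_RECALL * recalls


for attempt in range(1, 8):
    print(attempt, round(recall_interviews_per_ny(attempt), 4),
          round(interviewing_cost(1000, 0.25, attempt)))
```

With n = 1000 and y = 1/4 the costs for Attempts I+II and I-VII round to $3469 and $4516.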


TABLE 4 


RESULTS FOR THE OPTIMUM y, AND THE VALUES 
SELECTED FOR THE CALCULATIONS THAT LED 
TO TABLES 5 AND 6, AND TO FIGS. 2 AND 3 








1st set of a; 2d set of a; 





y calculated 
from 
Equation 38 


y selected 
for 
calculation 


y calculated 
from 
Equation 38 


y selected 
for 
calculation 





I—II 
I-III 
I-IV 
I-V 
I-VI 
I-—VII 





.69 
.67 
65 
.63 
-61 
.60 








33 








It should be noted that these costs are for the interviewing only. 
Considerations of overhead costs, training, and office-work for the 
different plans must be taken into account before one decides definitely 
whether one plan is more economical than another. 


CONCLUSIONS FROM THE CALCULATIONS 


The numerical results of the calculations are in Tables 5, 6, 7, 8 and
in Figs. 2 and 3. The biases and r-m-s errors are expressed in units of
a*. The base for the bias is the 0-point of the scale for the a_i. The
estimation is assumed to be a summation of the initial call and the recalls.
The aim is assumed to be the estimation of an average or of a total.


TABLE 5

NUMERICAL VALUES OF THE BIASES AND R-M-S ERRORS FOR
VARIOUS SIZES OF INITIAL SAMPLE (n); 1st SET OF a_i,
y=.6. COSTS AT n=1000

Plan               I          I+II       I-III      I-V
Rel bias           -.110874   -.075752   -.057302   -.036800
n=100
  Rel m-s-e        .025974    .019918    .017628    .015938
  Rel r-m-s-e      .161164    .141131    .132773    .126246
n=200
  Rel m-s-e        .019133    .012828    .010456    .008646
  Rel r-m-s-e      .138322    .113261    .102255    .092984
n=300
  Rel m-s-e        .016853    .010464    .008065    .006215
  Rel r-m-s-e      .129822    .102294    .089805    .078834
n=500
  Rel m-s-e        .015029    .008574    .006153
  Rel r-m-s-e      .122593    .092596    .078441
n=1000
  Rel m-s-e        .013661    .007156    .004718
  Rel r-m-s-e      .116880    .084593    .068688
n=2000
  Rel m-s-e        .012977    .006447    .004001
  Rel r-m-s-e      .113920    .080293
n=3000
  Rel m-s-e        .012749    .006211
  Rel r-m-s-e      .112911    .078810
n=5000
  Rel m-s-e        .012567    .006022
  Rel r-m-s-e      .112103
Costs at n=1000    $3000


A. Conclusions from the 1st set of a_i, a 2-fold variation from a_1 to a_8:
Table 5 and Fig. 2. Conclusions 1, 2, 3, 4, and 5b are independent of the
type and size of sample.


1. With no recalls at all (Attempt I only), the minimum relative
r-m-s error attainable is 11%. No sample however big, not even a complete
count, can penetrate below this minimum, without recalls.


FIGURE 2. The relative bias, the relative r-m-s error, and the cost, plotted against
the initial sample-size (n) for various plans, for the 1st set of a_i, in which a_1=2a_8.
The curves show the futility of attempting to achieve accuracy by sheer size of
sample. Recalls are much more effective. The dashed lines show the size of
sample required, and the cost, to yield a relative r-m-s error of 73%. The relative
biases and the relative r-m-s errors are in units of a*.


2. With one recall (Attempts I+II), the minimum r-m-s error drops
to 7.6%. No sample however big can penetrate below this minimum
with only one recall.

3. With 2 recalls (Attempts I+II+III), the minimum r-m-s error
drops to 5.7%. No sample however big can penetrate below this minimum
with only two recalls.

4. With 3 recalls (Attempts I-IV), the minimum r-m-s error drops
to 4.5%. With 4, 5, and 6 recalls, the minimum r-m-s error drops to 3.7,
3.0, and 2.5%.

5. To attain a prescribed r-m-s error of (e.g.) 73%: 


(a) We may use 3, 4, 5, or 6 recalls with initial samples as shown in the ac- 
companying table. 


From Fig. 2

No. of recalls    Initial sample    Cost
6                 345               $2,290
5                 378                2,390
4                 408                2,450
3                 512                2,800





(b) With 0, 1, or 2 recalls we can not attain the prescribed r-m-s error (73%) 
with any sample however big. 


B. Conclusions from the 2d set of a_i, a 10-fold variation from a_1 to a_8:
Table 6 and Fig. 3. Conclusions 6, 7, 8, 9, and 10b are independent of the
type and size of sample.


6. With no recalls at all (Attempt I only), the minimum r-m-s error
attainable is 24.5%. No sample however big, not even a complete
count, can penetrate below this minimum without recalls.

7. With one recall (Attempts I+II), the minimum r-m-s error drops
to 15.5%. No sample however big can penetrate below this minimum
with only one recall.

8. With 2 recalls (Attempts I+II+III), the minimum r-m-s error
drops to 11.3%. No sample however big can penetrate below this minimum
with only two recalls.

9. With 3 recalls (Attempts I-IV), the minimum r-m-s error drops 
to 8.7%. With 4, 5, and 6 recalls, the minimum r-m-s error drops to 
6.9, 5.6, and 4.7%. 

10. To attain a prescribed r-m-s error of (e.g.) 10%: 


(a) We may use 3, 4, 5, or 6 recalls with initial samples as shown in the
accompanying table.

From Fig. 3

No. of recalls    Initial sample    Cost
6                 210               $  875
5                 245                1,010
4                 325                1,380
3                 730                2,910





(b) With 0, 1, or 2 recalls we can not achieve the prescribed r-m-s error (10%) 
with any sample however big. 


TABLE 6

NUMERICAL VALUES OF THE BIASES AND R-M-S ERRORS FOR
VARIOUS SIZES OF INITIAL SAMPLE (n); 2d SET OF a_i,
y=.25. COSTS AT n=1000

Plan               I         I+II      I-III     I-IV      I-V       I-VI      I-VII
Rel bias           .245190   .155062   .112665   .086955   .069408   .056593   .046815
n=100
  Rel m-s-e        .091985   .046455   .032012   .025385   .021763   .019563   .018135
  Rel r-m-s-e      .303290   .215534   .178919   .159327   .147523   .139868   .134666
n=200
  Rel m-s-e        .076052   .035250   .022353   .016473   .013290   .011383   .010164
  Rel r-m-s-e      .275775   .187750   .149509   .128347   .115282   .106691   .100817
n=300
  Rel m-s-e        .070741   .031515   .019133   .013502   .010466   .008656   .007507
  Rel r-m-s-e      .265972   .177525   .138322   .116198   .102303   .093038   .086643
n=500
  Rel m-s-e        .066492   .028527   .016558   .011125   .008207   .006475   .005381
  Rel r-m-s-e      .257860   .168899   .128678   .105475   .090592   .080467   .073355
n=1000
  Rel m-s-e        .063306   .026286   .014626   .009343   .006512   .004839   .003787
  Rel r-m-s-e      .251607   .162130   .120938   .096659   .080697   .069563   .061539
n=2000
  Rel m-s-e        .061712   .025166   .013660   .008451   .005665   .004021   .002990
  Rel r-m-s-e      .248419   .158638   .116876   .091929   .075266   .063411   .054681
n=3000
  Rel m-s-e        .061181   .024792   .013338   .008154   .005383   .003748   .002724
  Rel r-m-s-e      .247348   .157455   .115490   .090299   .073369   .061221   .052192
n=5000
  Rel m-s-e        .060756   .024493   .013080   .007917   .005157   .003530   .002512
  Rel r-m-s-e      .246487   .156502   .114368   .088978   .071812   .059414   .050120
Costs at n=1000    $3000     3469      3779      4014      4207      4371      4516


FIGURE 3. The relative bias, the relative r-m-s error, and the cost, plotted
against the initial sample-size (n) for various plans, for the 2d set of a_i, in which
a_1=.1a_8. The curves show the futility of attempting to achieve accuracy by
sheer size of sample. Recalls are much more effective. The dashed lines show the
size of sample required, and the cost, to yield a relative r-m-s error of 10%.
The relative biases and the relative r-m-s errors are in units of a*.


C. General conclusions 


11. Even with three recalls, with the level of response assumed in the
calculations (taken from average urban experience), a sample bigger
than the binomial equivalent of from 300 to 500 for an estimate of any
one class is ineffective and uneconomical. A plan that would reap any
real benefit from bigger samples must support 4 or 5 or more recalls.

12. An attempted “complete count” is no exception, and often represents
an extreme waste of effort.

13. With the proportions of nonresponse assumed here, high ac- 
curacy can be attained only with 4, 5, or 6 recalls, along with an initial 
sample equivalent to from 800 to 1500 binomial cases. Careful con- 
sideration should therefore be given in the planning to decide whether 
the need for extreme accuracy warrants the required expense and delay 
occasioned by recalls beyond the 3d, and for an initial sample bigger 
than the binomial equivalent of n =300 in any subclass of the universe 
for which an estimate is desired. 

14. Table 8 shows that where extremely high accuracy is required, 
the Politz plan with 2000 or more binomial cases becomes competitive 
in cost with a survey that depends on recalls. In any case, the Politz 
plan has the advantage of speed, and of being able to produce results 
under circumstances wherein recalls are impossible. 

15. Because one kind of experience may be translated into another 
by transformations similar to Equation 39, the generality of the above 
conclusions and their impact on the design and interpretation of surveys
and of complete counts are inescapable. A limiting case of exception
occurs, of course, when the range of variation of the a_i is small
compared with a*.

16. The above conclusions with respect to the number of recalls re- 
quired are generally applicable to all types of sample-design for draw- 
ing the sampling units. A change in sample-design (as from the bi- 
nomial sampling of individuals to samples of areas) only changes (usu- 
ally widens) the distance from the bias to the r-m-s error in Figs. 2 and 
3, without raising or lowering the bias. The most economical number (n) 
of interviews in an area sample, for any given number of recalls, will 
for most characteristics be bigger than the figures mentioned in con- 
clusions 11, 13, and 14. The increase may range from 0 on up to some- 
times double, depending on the characteristic and the clustering effect 
of the interviewers’ workloads. 


IMPACT ON DESIGN 


The most impressive feature of the results is the heavy bias of nonresponse,
when no provision is made to reduce it, even though there
be but a 2-fold variation from a_1 to a_8.

The second most impressive feature is the fact that if nonresponse
reaches anywhere near the proportions (p_i) assumed, then when the 0
of the scale of the a_i is not large, we can not afford, except for special
justification, to plan for extreme accuracy: it is simply too expensive.

This conclusion is also borne out by Table 7, which shows that more 
information per dollar comes from a sample of 500 than for a sample 
of 1000; and that every successive recall shows a gain in the amount of 
information obtained per dollar, particularly for the smaller sample 
size. An optimum is not reached even with six recalls. In other words, 
as we concluded earlier from Figs. 2 and 3 and from Tables 5 and 6, we 
get more for our money by taking a moderate initial sample and dig- 
ging deep into it with many recalls. However, many recalls delay the 
day on which the tabulations will be ready, and one may be forced to 


TABLE 7

THE AMOUNT OF INFORMATION PER UNIT COST FOR THE SEVEN
PLANS (FROM 0 TO 6 RECALLS), FOR INITIAL SAMPLES OF 500 AND
1000. INFORMATION IS DEFINED AS THE RECIPROCAL OF THE
REL M-S-E IN TABLES 5 AND 6. THE COST COMES FROM TABLE 3

              1st set of a_i              2d set of a_i
Plan          n=500        n=1000        n=500        n=1000
I             .044 358     .024 400      .005 013     .005 265
I+II          .056 562     .033 877      .020 216     .010 966
I-III         .066 744     .043 522      .031 954     .018 092
I-IV          .074 326     .052 524      .044 787     .026 664
I-V           .079 422     .060 315      .057 912     .036 501
I-VI          .082 438     .066 519      .070 649     .047 278
I-VII         .083 856     .071 127      .082 302     .058 472








call a halt at 3 or 4 recalls. Where speed is urgent, or where recalls are 
otherwise inadvisable, one may bear in mind the Politz plan, which of- 
fers a rapid solution with recalls only on the temporary refusals. 
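
The measure used in Table 7, information per unit cost, is simply the reciprocal of the relative mean square error divided by the cost. The Python sketch below is illustrative and not part of the original paper; it assumes the relative m-s-e values and the costs in the n = 1000 rows of Table 6 (2d set of a_i, y = .25), and the function name is hypothetical. It reproduces the last column of Table 7 to within rounding of the tabled inputs.

```python
def information_per_dollar(rel_mse, cost):
    """Information (reciprocal of the relative m-s-e) obtained per dollar."""
    return (1.0 / rel_mse) / cost


# Assumed inputs, copied from Table 6 (2d set of a_i, n = 1000, y = .25).
PLANS = ["I", "I+II", "I-III", "I-IV", "I-V", "I-VI", "I-VII"]
REL_MSE = [0.063306, 0.026286, 0.014626, 0.009343, 0.006512, 0.004839, 0.003787]
COST = [3000, 3469, 3779, 4014, 4207, 4371, 4516]

INFO = [information_per_dollar(m, c) for m, c in zip(REL_MSE, COST)]

for plan, value in zip(PLANS, INFO):
    print(plan, round(value, 6))
```

Each successive recall raises the information per dollar in this column, which is the pattern on which the text relies.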
With the usual method of estimation (pooling the initial call and the 
recalls) the best way to attain accuracy is to build up the initial response
(i.e., to increase p_8). One or two recalls would then be much
more effective than they are under the conditions assumed; and bigger 
samples would also be more effective. Observations on the proper time 
of day to find certain kinds of people at home in a particular area, and 
willing to answer questions, plus a skillful introduction and approach 
so as to cut down refusals, are known to be helpful in this direction. 
An attempted complete count is no exception to the conclusions
reached. Without a highly successful initial response, followed by some
effective number of recalls, 95% of the energy put into a complete
count, taken to obtain an estimate for a large area, may be wasted.
Size does not atone for nonresponse: this is all too evident from the
calculations (Tables 5 and 6; Figs. 2 and 3).

The mechanism adopted here is a device by which experience can 
be accumulated and pointed toward the attainment of (a) greater accuracy
per unit cost, and (b) less waste, through conservation of unproductive
effort expended on samples that are too big. Good guesses
for the constants p_i, a_i, σ_i can almost always be made on the basis of
past experience; and the calculations made with them will indicate a
plan not far from optimum. Continued experience will provide im- 
proved numerical values for the constants, and continually improved 
design and interpretation of the results. Without a probability design 
of some sort, it is difficult to capitalize on experience. 

Although the discourse here has been entirely in terms of interviews, 
the results are equally applicable to surveys in which the initial at- 
tempt is made by mail, or in which all attempts are made by mail. 
Appropriate changes must of course be made in the numerical values of 
the constants. Thus, if the mail were used for Attempt I, and if inter- 
views were used for the recalls, then the cost D in Equation 38 would 
be much less than it is when interviews are used in Attempt I, and y 
will then be smaller. For example, if the cost of a mailed questionnaire 
were $.75, and if the cost of an interview on a nonresponse were $5, 
then y would reduce to perhaps as low a figure as 1 in 6, depending of 
course on the other constants in the equation. 
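
The effect of a cheaper first attempt on y follows from Equation 38, in which y is proportional to the square root of D/E. The Python sketch below is an illustration under stated assumptions, not a calculation from the paper: it supposes that B, C and E are unchanged, that D falls from $3 to $.75, and that the starting value of y is one-third (a hypothetical figure). Under those assumptions y is halved, to 1 in 6.

```python
import math


def rescaled_fraction(y_old, d_old, d_new, e_old=1.0, e_new=1.0):
    """Rescale y when D and E change, with B and C held fixed (Equation 38)."""
    return y_old * math.sqrt((d_new / d_old) * (e_old / e_new))


# Hypothetical starting value y = 1/3; D drops from $3 (interview) to $.75 (mail).
y_mail = rescaled_fraction(1.0 / 3.0, d_old=3.0, d_new=0.75)
print(round(y_mail, 4))
```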

One may well wonder what the biases are in surveys that depend 
only on a mailed survey with a 15% total response, or even 30% or 
50%, without calls on the nonresponses. The mechanism adopted here 
shows that it is a mystery how such results can be worth anything at 
all. 


IMPACT ON METHODS OF ESTIMATION 


After the returns from the survey are in, there remains the problem 
of estimating the mean per sampling unit, and the standard error of 
this estimate. As the survey does not touch Class 0, it can by itself 
only produce estimates for Classes 1-8.

The usual practice of combining the various attempts (after weighting
Attempt II and higher attempts by the factor 1/y) may be both
misleading and inefficient. A glance at Table 5 or at Figure 2 shows that
41% of the bias still remains after the 3d recall, and that 27% still
remains after the 5th. Table 6 and Figure 3 are equally discouraging.
The decreasingly slow ascent toward the vertex of 0 bias may explain
how easy it is to conclude, incorrectly, that after 3 recalls there is little
more bias to squeeze out, and that additional recalls are not worth their
cost.

To illustrate the usual procedure, let us make some calculations on
a yes-and-no survey.¹⁰ The proportions of yes in the various classes will
range, let us suppose, from 60% in Class 1 down to 40% in Class 8,
following the relative values of 20(a_i+1) derived from the 1st set of a_i
in Table 1. Table 9, calculated with the aid of Equation 41, shows the
expected results of combining 2 attempts, 3 attempts, etc. The result
that we really need is the patient mean, shown at the bottom of the table
as the expected result of continuing the recalls indefinitely. The
slow progress of the combined result is obvious; also the need of something
better.


TABLE 9

THE EXPECTED PROPORTIONS OF YES, FOR SEVERAL PLANS,
COMPUTED BY EQUATION 41. THE PROPORTIONS OF YES
RANGE FROM 60% IN CLASS 1 TO 40% IN CLASS 8

                  Expected proportion        Bias
Plan              Yes            No          remaining
Attempt I         44.10          55.90       100.0%
I+II              45.05          54.95        68.4
I+II+III          45.55          54.45        51.8
I-IV              45.88          54.12        40.9
I-V               46.11          53.89        33.2
I-VI              46.28          53.72        27.6
I-VII             46.42          53.58        22.8
Infinity          a'* = 47.11    52.89         0








What we need is a way to extract more information from the recalls.
A more efficient estimate may be contained in a scheme for extrapolating
the results of the various attempts, as proposed by Hendricks.¹¹
The mechanism proposed here will provide a rational scale for the
extrapolation. It may be that the scale proposed by Hendricks is appropriate,
or it may be that some other scale will give more accurate
results with convenience.


10 I am indebted to Dr. Leo P. Crespi and to Mr. Fred W. Trembour of the Reactions Analysis
Staff in the Office of the High Commissioner for Germany, who in several conversations with the author
brought up questions and suggestions that led to this illustration.

11 Walter A. Hendricks, Chapter 5 in the book Agricultural Estimating and Reporting Service (Miscellaneous
Publications No. 703, Bureau of Agricultural Economics, Washington, 1949); pages 31-35 in
particular.

For an estimate of a* by extrapolation, we may look upon recalls as 
necessary to provide the required coordinates of points by which to 
make the extrapolation, and not merely to provide additional returns 
to add to the initial attempt. 

For this new type of estimate, the standard error would not be cal- 
culated in the usual way (Equation 24), but as the standard error 
of the intercept on the scale along which we read, by extrapolation, the 
estimate of a*. New theory will be required for the optimum allocation 
of effort amongst the various recalls, and for effecting the extrapola- 
tion; also for calculating its standard error. It may turn out, for ex- 
ample, that unless one can achieve extremely high initial response, 
approaching 90%, there may be little point in expending funds to build 
it up. It is possible that theory beyond the scope of this paper may lead 
to efficiency and reliability far beyond those attained in practice today. 


SOME REMARKS ON CLASS 0 


We must face the fact that our survey can at best only provide estimates
for Classes 1-8, although it can also give us the proportion p_0
and some of the characteristics of Class 0. The administrative decisions 
that the survey was expected to help may nevertheless involve Class 0 
along with the others. In a marketing study, for example, the people in 
this class may be heavy purchasers of the very commodity that forms 
the subject of the survey. They may in part be people who travel much, 
and who may thus be important to a railway, an air line, a manufac- 
turer of automobiles, a hotel, and to others. They may be people in 
high income groups. It may therefore be important to learn how much 
we are missing by not bringing them into the survey. 

Unfortunately, it is impossible to learn this magnitude from the sur- 
vey itself. The only possible approach seems to be from outside sources, 
such as through statistics on the total movement of a particular product 
from wholesale into retail stores. It is possible in many cases to gather 
outside evidence by which to evaluate approximately the magnitude 
of a_0 (the mean in Class 0), or rather of the total a_0 p_0 in Class 0, for
some of the important characteristics that affect the decisions or relate
to them. The next step is to ascribe upper and lower bounds to the possible
magnitude of a_0 p_0, and thus to infer the possible effects of Class 0
on the uses and limitations of the data.¹²





12 This suggestion came from Professor Philip M. Hauser in an informal conversation in regard to
this research.

The difficulty with Class 0 is not peculiarly a sampling problem, as
Class 0 appears in complete counts as well as in samples; in fact, it is
undoubtedly bigger in complete counts than in samples.


EFFECT OF WEIGHTING BY CARD-DUPLICATION
ON THE EFFICIENCY OF SURVEY RESULTS


Irving Roshwalb
Opinion Research Corporation


Survey designs often specify that weights be applied to certain
groups of observations as part of the estimation procedures. These
estimation procedures are designed as complements of sampling instructions
which may, for example, specify methods for eliminating
call-backs,¹ or the disproportionate allocation of the sample to the several
strata.² A third example of the need for weighting may be taken
from the fact that the denominator of the sampling fraction, 1/k, is frequently
a non-integral divisor of the size of the population to which the
fraction is applied. This means that the number of cases obtained is
very often not equal to the number of cases desired, and that some
weighting adjustments may thus be called for. The subject of this note
is the specific problem of the effect of non-integral weights (e.g., 1.2,
1.75, 8.4, etc.) on the sampling variability of the survey results. The
two weighting procedures described below, arithmetic and card-duplication,
give identical results when the weights are integral. This identity
disappears when the weights are non-integral, for card-duplication
involves sampling the returned questionnaires for reproduction. A discussion
of the problem for the case of stratified sampling, using disproportionate
allocation, offers a useful solution.

When a sample design calls for the disproportionate allocation of the 
sample to the several strata, we have seen that there are two alternative 
estimation procedures 

(a) Arithmetic weighting: If W_i is the proportion of the population in the
ith stratum and x̄_i is the sample estimate of the mean in the ith stratum,
then x̄ = Σ W_i x̄_i is an unbiased estimate of the population mean, μ (i = 1,
2, ..., L).

(b) Card-duplication weighting: If the results of the survey can be recorded
on punch-cards, then it is possible to weight the N_i observations in each
stratum to their proper weight in the sample by drawing a random sample
of n_i cards from the original N_i cards so that the total number of cards in
the stratum, (N_i+n_i), is equal to W_i(N+n), where (N+n) is the total
number of cards in all strata, including both the original and the duplicate
cards.





1 A. Politz and W. Simmons, “An attempt to get the not-at-homes into the sample without call-backs,”
Journal of the American Statistical Association, 44 (1949), 9-31.
2 W. E. Deming, Some Theory of Sampling. New York: John Wiley and Sons, 1950, p. 215.

In the case of arithmetic weighting, the variance of x̄ is known to be

$$ \mathrm{Var}\,\bar{x} = \sum_{i=1}^{L} W_i^2\,\mathrm{Var}\,\bar{x}_i. \tag{1} $$


In the case of card-duplication, the variance of the sample estimate
of the mean may contain an additional component of variation due to
the sample of $n_i$ cards drawn from the original sample $N_i$ in the $i$th
stratum. Thus, if $n_i = 0$, or $n_i = dN_i$, where $d$ is an integral number, then
this additional component is equal to zero. If, however, $n_i/N_i > 0$ and
non-integral, this additional component may be greater than zero.
Disregarding the stratal index, we may then seek the value of $n$ (the
number of cards to duplicate) which maximizes the sampling error of an
estimate based on the $N + n$ cards, original and duplicates, and,
coincidentally, examine the effect of card-duplication on the variance of the
estimate.

Assume that a random sample of $N$ items is drawn without replacement
from a population of size $P$ with mean $\mu$ and variance $\sigma^2$.
Draw $n$ cards at random without replacement from the $N$, duplicate
these $n$ cards, and estimate $\mu$ on the basis of the $N + n$ cards. This last
estimate is

$$m = \frac{1}{N+n}\left\{2\sum_{i=1}^{n} y_i + \sum_{i=n+1}^{N} y_i\right\} \tag{2}$$

and

$$\operatorname{Var} m = \frac{\sigma^2}{(N+n)^2}\left\{4n\,\frac{P-n}{P-1} + (N-n)\,\frac{P-(N-n)}{P-1}\right\}. \tag{3}$$

Considered as a function of $n$,

$$\operatorname{Var} m = \frac{\sigma^2}{N}\cdot\frac{P-N}{P-1}$$

for $n = 0$ and $n = N$. Var $m$ attains its maximum value when $n = N/3$.
When $n = N/3$,

$$\operatorname{Var} m = \frac{\sigma^2}{n}\left\{\frac{3P-4n}{8(P-1)}\right\}.$$


In order to study the effect of weighting by card-duplication on the
variance of the sample mean, we might first compute the relative
information³ of this estimate, where the relative information, computed as
the ratio between Var $\bar{x}$ and Var $m$, is

$$I = \frac{(1+d)^2\,(1-r)}{(3d+1) - (5d^2 - 2d + 1)\,r}, \tag{4}$$

where $d$ is the rate of duplication $= n/N$, and $r$ is the sampling rate
$= N/P$.


EXAMPLES OF THE EFFECT OF CARD-DUPLICATION ON
THE VARIANCE OF THE SAMPLE MEAN

                         Rate of Duplication (d)
Sampling         20%       33⅓%       50%       66⅔%
Rate (r)         Relative Information* (percent)

  0†            90.00      88.89      90.00      92.59
  .01           89.55      88.39      89.55      92.26
  .05           87.69      86.36      87.69      90.82
  .10           85.26      83.72      85.26      88.93
  .50           60.00      57.14      60.00      67.57
  .90           16.36      14.81      16.36      21.37

* Relative Information $= (\operatorname{Var} \bar{x} / \operatorname{Var} m) \times 100$. The Relative Loss of
Information due to card-duplication may be computed by subtracting the
appropriate $I$ value from 100%, i.e., the Relative Loss of Information
$= 100\% - (\operatorname{Var} \bar{x} / \operatorname{Var} m) \times 100$.

† A zero sampling rate corresponds to the case of sampling from an infinite
population, or sampling with replacement.


When $r = 0$, the condition of sampling with replacement holds, and
$I = (1+d)^2/(3d+1)$, a function of the rate of duplication only. The
table exhibits a few examples of the effect of card-duplication on the
variance of the sample mean, $m$, for several sampling rates. The figure,
exhibiting the graph of $I$ as a function of $d$ for various values of $r$,
demonstrates the slight losses in efficiency due to card-duplication when the
sampling rate is small. It also points up the shallowness of these curves,
i.e., the insensitiveness of $I$ to changes in $n$ within a broad interval
about the critical value, $n = N/3$.
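The entries of the table can be recomputed directly from expression (4). The following is a minimal sketch; the function and variable names are illustrative only and do not come from the original report.

```python
def relative_information(d: float, r: float) -> float:
    """Relative information I of expression (4), returned as a fraction.

    d is the rate of duplication n/N and r is the sampling rate N/P.
    """
    numerator = (1.0 + d) ** 2 * (1.0 - r)
    denominator = (3.0 * d + 1.0) - (5.0 * d * d - 2.0 * d + 1.0) * r
    return numerator / denominator


# Rates used in the table of examples.
duplication_rates = [0.20, 1.0 / 3.0, 0.50, 2.0 / 3.0]
sampling_rates = [0.0, 0.01, 0.05, 0.10, 0.50, 0.90]


def table_rows():
    """Relative information in percent, one row per sampling rate."""
    return [
        [100.0 * relative_information(d, r) for d in duplication_rates]
        for r in sampling_rates
    ]


def least_efficient_rate(r: float, steps: int = 300) -> float:
    """Grid search over d in [0, 1] for the rate that minimizes I."""
    grid = [i / steps for i in range(steps + 1)]
    return min(grid, key=lambda d: relative_information(d, r))


if __name__ == "__main__":
    for r, row in zip(sampling_rates, table_rows()):
        print(f"r = {r:4.2f}: " + "  ".join(f"{value:6.2f}" for value in row))
    print("least efficient d when r = 0:", least_efficient_rate(0.0))
```

With $r = 0$ the grid search lands on $d = 1/3$, in agreement with the critical value $n = N/3$ discussed above.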

3 R. A. Fisher, Statistical Methods for Research Workers, Tenth Edition. Edinburgh: Oliver and Boyd,
1948, Section 55.

[Figure: Effect of Weighting by Card-Duplication: Relative Information of the Weighted Sample.]

When card-duplication is applied in the case of a stratified sample,

$$m = \sum_{i=1}^{L} \frac{N_i + n_i}{N + n}\, m_i, \tag{5}$$

$$\operatorname{Var} m = \sum_{i=1}^{L}\left(\frac{N_i+n_i}{N+n}\right)^2 \operatorname{Var} m_i
= \sum_{i=1}^{L} W_i^2\,\frac{\sigma_i^2}{(N_i+n_i)^2}\left\{4n_i\,\frac{P_i-n_i}{P_i-1} + (N_i-n_i)\,\frac{P_i-(N_i-n_i)}{P_i-1}\right\}, \tag{6}$$

where $m_i$ is the sample estimate of the mean in the $i$th stratum, $N_i$ and
$n_i$ are as before, $N = \sum N_i$, $n = \sum n_i$, and $W_i = (N_i+n_i)/(N+n)$.
Comparing expressions (1) and (6), we can see that Var $m$ − Var $\bar{x}$ is 0 for
$n_i = 0$ or $N_i$. The least efficient case is the limiting one for which
$n_i = N_i/3$. In this case,


Ci 


Var m = 1.125 Var? + 0.5), W? 7 
t=] 


The increase in sampling variance for the stratified sample due to
card-duplication is a resultant of the strata increases. Let the symbol
for this increase be $C^2$.

In any practical operation it might be asked whether $C^2$ is larger or
smaller than the (bias)² due to incorrect strata weights that the
weighting procedure is designed to remove.

If $w_i$ are the incorrect strata weights and $W_i$ the correct ones, the
(bias)² due to the use of incorrect weights may be expressed as

$$B_w^2 = \left\{\sum_{i=1}^{L} (w_i - W_i)\,\bar{x}_i\right\}^2. \tag{7}$$

It would seem then that weighting by card-duplication is useful only
when $B_w^2 > C^2$. On the other hand, nothing is to be gained and
something may be lost if $B_w^2 \le C^2$.

This indicates that if the method of card-duplication is used to ob- 
tain estimates from a disproportionate sample, then whatever gains 
that might have been expected from the disproportionality could be 
seriously reduced by this weighting procedure. In other words, the 
gains due to the sample design must be at least large enough to over- 
come the loss of efficiency due to the weighting scheme. 











THE MATHEMATICAL BASIS FOR THE BEAN METHOD OF
GRAPHIC MULTIPLE CORRELATION*


Richard J. Foote
Bureau of Agricultural Economics


In 1929, Louis H. Bean published an article describing a graphic
method of multiple correlation which subsequently has been widely
used, particularly in the field of agricultural economics.¹ In the late
1930’s, considerable controversy arose among users of the method.
This controversy in part concerned the correct interpretation of the
results obtained in terms of standard mathematical coefficients. To
clarify these aspects, the writer, with J. Russell Ives, published in 1941
a paper outlining in detail the relationship of the graphic method to
the mathematical method of least squares.² The mathematical proofs
in that paper were developed with the assistance of M. A. Girshick,
then on the staff of the Bureau of Agricultural Economics. The
material presented here is based mainly on that given in the 1941 paper,
but includes certain closely related aspects not published at that time.
Attention is concentrated on the relationship of the graphic method to
the mathematical method of least squares. Adequate descriptions of
the mechanics of the graphic method are available in a number of
publications.³

Relationships between the graphic method and the least squares
method can be explained most effectively in terms of the simpler cases,
especially three-variable linear multiple regression. In such cases, if $X_1$
is the dependent variable and $X_2$ and $X_3$ are independent variables,
when $r_{23}$ is equal to zero, each partial regression coefficient is identical
with the corresponding simple regression coefficient. When $r_{23}$ is
considerably different from zero, the device of “drift lines” (to be discussed
later) facilitates estimation of first approximations to $b_{12.3}$ and $b_{13.2}$ which
are superior to $b_{12}$ and $b_{13}$. Successive transference of residuals leads to
lines with slopes approximately equal to the mathematically-calculated
values of $b_{12.3}$ and $b_{13.2}$, but the process is slow when $r_{23}$ is considerably
different from zero. One problem which tends to slow up the speed of
convergence is the inability of the research worker to draw least
squares regression lines accurately. There is a common tendency to
draw them too steep. Another problem is due to the fact that the
iterative process converges more rapidly if measured by the mean square
residual than if measured in terms of a particular partial regression
coefficient. Thus, in difficult cases, the first several rounds of the iterative
process may yield a good approximation to $R_{1.23}$ but poor
approximations to the regression coefficients $b_{12.3}$ and $b_{13.2}$. These points are
discussed in more detail in the remainder of this paper.

* This material was prepared for presentation at the 1952 meetings of the American Statistical
Association. Due to certain unavoidable complications, the session on graphic correlation was not
held. As the paper by Foote and Ives referred to in footnote 2 was issued only in mimeographed form
and is now available only in libraries, it appeared worth while to publish this as a journal paper.

1 Louis H. Bean, “A simplified method of graphic curvilinear correlation,” Journal of the American
Statistical Association, 24 (1929), 386-97. A mimeographed publication containing essentially the same
material was issued by the Bureau of Agricultural Economics.

2 “The relationship of the method of graphic correlation to least squares,” Statistics and Agriculture
No. 1, U. S. Bureau of Agricultural Economics, 1941. (Processed.)

3 See for example Bean, op. cit., or Thomsen, Frederick Lundy and Foote, Richard Jay, Agricultural
Prices, McGraw-Hill Book Co., New York, 1952, pp. 296-310.


MATHEMATICAL MEANING OF THE DRIFT LINES USED IN 
THE GRAPHIC ANALYSIS 
If, in the equation

$$X_1 = b_1 + b_2X_2 + b_3X_3 + \cdots + b_pX_p, \tag{1}$$

constant values are assigned to $X_3, \ldots, X_p$, then $b_3X_3 + \cdots + b_pX_p$
is equal to some constant that can be combined with the constant $b_1$
to give a new constant $K$. Equation (1) can then be written as

$$X_1 = K + b_2X_2, \tag{2}$$

which is the equation of a straight line having a slope equal to $b_2$. Here
$b_2$, which may be written as $b_{12.34\ldots p}$, is the regression of $X_1$ on $X_2$ when
$X_3, \ldots, X_p$ are constant.
If two or more observations in a scatter diagram of $X_1$ on $X_2$ had the
same or approximately the same value of $X_3, \ldots, X_p$, then an
estimate of $b_{12.3\ldots p}$ could be obtained by drawing a best fitting line through
them. If this process were repeated for several groups of points having
the same $X_3, \ldots, X_p$ values, several lines whose slopes are estimates
of the same partial regression coefficient, $b_{12.3\ldots p}$, would be obtained.
The process of obtaining estimates of $b_{12.3\ldots p}$ from the slope of these
lines is equivalent to breaking the total sample into selected
sub-samples and obtaining from each of these an independent estimate of
$b_{12.3\ldots p}$. Such lines are the “drift lines” used in the graphic method.

The closeness with which the average of these slopes approximates 
the mathematical partial regression line will depend upon the stability 
of the slopes of the individual drift lines. In general, the amount of 
fluctuation that may be expected in the slopes of the drift lines will 
depend on (1) the number of observations and the extent of variation 
in the $X_2$ values on which each is based and (2) the size of the partial
correlation between $X_1$ and $X_2$ when $X_3$ is constant.
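The idea behind drift lines can be illustrated numerically. The sketch below uses simulated data (all values and names are assumptions made for illustration, not data from the paper): lines are fitted to $X_1$ on $X_2$ within groups of observations that share the same $X_3$, and their slopes are compared with the least squares partial regression coefficient and with the simple regression coefficient.

```python
import numpy as np


def drift_line_slopes(x1, x2, x3):
    """Least squares slope of X1 on X2 within each group of equal X3."""
    slopes = {}
    for level in np.unique(x3):
        mask = x3 == level
        # A line needs at least two points with some spread in X2.
        if mask.sum() >= 2 and np.ptp(x2[mask]) > 0:
            slopes[float(level)] = float(np.polyfit(x2[mask], x1[mask], 1)[0])
    return slopes


def partial_regression_coefficients(x1, x2, x3):
    """b12.3 and b13.2 from the least squares fit of X1 on X2 and X3."""
    design = np.column_stack([np.ones_like(x2), x2, x3])
    coefficients = np.linalg.lstsq(design, x1, rcond=None)[0]
    return float(coefficients[1]), float(coefficients[2])


# Simulated sample: X3 takes four distinct values, X2 is correlated with X3.
rng = np.random.default_rng(0)
x3 = np.repeat([1.0, 2.0, 3.0, 4.0], 25)
x2 = 0.8 * x3 + rng.normal(size=x3.size)
x1 = 5.0 + 2.0 * x2 - 1.5 * x3 + rng.normal(scale=0.3, size=x3.size)

slopes = drift_line_slopes(x1, x2, x3)
b12_3, b13_2 = partial_regression_coefficients(x1, x2, x3)
simple_slope = float(np.polyfit(x2, x1, 1)[0])

if __name__ == "__main__":
    print("drift-line slopes by X3 level:", slopes)
    print("partial regression b12.3:", b12_3)
    print("simple regression b12:", simple_slope)
```

In this construction each within-group slope lies close to the partial regression coefficient, while the simple regression of $X_1$ on $X_2$ is pulled well away from it by the correlation between $X_2$ and $X_3$.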
PROOF THAT THE SIMPLE REGRESSION IN THE FINAL CHART WILL
EQUAL THE PARTIAL REGRESSION PROVIDED THE PARTIAL
REGRESSION IN THE FIRST CHART HAS BEEN ESTIMATED
CORRECTLY (THREE-VARIABLE PROBLEM)


As it is not obvious that the simple regression in the second or final
chart of a three-variable problem will equal the partial regression even
if the drift lines correctly estimate the partial regression in the first
chart, a mathematical proof is given. Stated mathematically, the
problem is as follows: if the deviations from the regression in the first chart
are considered as a new variable $V_1$, so that

$$V_1 = X_1 - b_{12.3}X_2,$$

and if the first approximation to $b_{12.3}$ is equal to its mathematically
calculated value for the sample of data, we wish to show that the simple
regression between $V_1$ and $X_3$ is equal to $b_{13.2}$.

The following symbols, which may be new to some readers, are used.

$$a_{12} = \frac{S(X_1 - \bar{X}_1)(X_2 - \bar{X}_2)}{N-1}, \qquad a_{13} = \frac{S(X_1 - \bar{X}_1)(X_3 - \bar{X}_3)}{N-1}, \text{ etc.}$$

$$a_{11} = \frac{S(X_1 - \bar{X}_1)^2}{N-1}, \qquad a_{22} = \frac{S(X_2 - \bar{X}_2)^2}{N-1}, \text{ etc.}$$

where $\bar{X}_1$ is the mean of $X_1$, etc., $N$ is the number of observations in the
sample, and $S$ is the sum for all observations in the sample.
The following transformations can be made.

$$a_{12} = s_1s_2r_{12}, \qquad a_{13} = s_1s_3r_{13}, \text{ etc.}$$
$$a_{11} = s_1^2, \qquad a_{22} = s_2^2, \text{ etc.}$$

where $s_1$ is the standard deviation of $X_1$, etc.
It can be shown that in terms of the standard deviations and
correlations

$$b_{12.3} = \frac{s_1}{s_2}\cdot\frac{r_{12} - r_{13}r_{23}}{1 - r_{23}^2} \tag{3}$$

and

$$b_{13.2} = \frac{s_1}{s_3}\cdot\frac{r_{13} - r_{12}r_{23}}{1 - r_{23}^2}. \tag{4}$$

It is desired to determine the simple regression coefficient of $V_1$ on $X_3$
when

$$V_1 = (X_1 - \bar{X}_1) - b_{12.3}(X_2 - \bar{X}_2).^4$$

Now

$$b_{V_1X_3} = \frac{S\,V_1(X_3 - \bar{X}_3)}{S(X_3 - \bar{X}_3)^2}
= \frac{S[(X_1 - \bar{X}_1) - b_{12.3}(X_2 - \bar{X}_2)](X_3 - \bar{X}_3)}{S(X_3 - \bar{X}_3)^2}
= \frac{a_{13} - b_{12.3}a_{23}}{a_{33}}.$$

Substituting the value of $b_{12.3}$ from equation (3) and simplifying,

$$b_{V_1X_3} = \frac{s_1}{s_3}\cdot\frac{r_{13} - r_{12}r_{23}}{1 - r_{23}^2}.$$

By equation (4),

$$b_{V_1X_3} = b_{13.2}.$$

Except that more algebraic manipulations are required, it is equally
easy to show that the simple regression on the final chart of a problem
involving more variables will equal the partial regression $b_{1n.23\ldots(n-1)}$,
provided that all of the other partial regressions have been estimated
correctly by the use of drift lines.
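The three-variable result can be checked numerically. The following sketch uses simulated data (the data and variable names are assumptions for illustration): it computes $b_{12.3}$ and $b_{13.2}$ by least squares, forms $V_1 = X_1 - b_{12.3}X_2$, and compares the simple regression of $V_1$ on $X_3$ with $b_{13.2}$ and with equation (4).

```python
import numpy as np

# Simulated three-variable sample with correlated independent variables.
rng = np.random.default_rng(1)
n = 200
x2 = rng.normal(size=n)
x3 = 0.6 * x2 + rng.normal(size=n)
x1 = 1.0 + 0.7 * x2 + 1.3 * x3 + rng.normal(size=n)

# Mathematically calculated partial regression coefficients.
design = np.column_stack([np.ones(n), x2, x3])
_, b12_3, b13_2 = np.linalg.lstsq(design, x1, rcond=None)[0]

# Deviations from the first-chart regression, treated as a new variable V1.
v1 = x1 - b12_3 * x2

# Simple regression coefficient of V1 on X3.
dev3 = x3 - x3.mean()
b_v1_x3 = float(np.sum((v1 - v1.mean()) * dev3) / np.sum(dev3 ** 2))

# Equation (4), written in terms of standard deviations and correlations.
s1, s3 = x1.std(ddof=1), x3.std(ddof=1)
r = np.corrcoef(np.vstack([x1, x2, x3]))
r12, r13, r23 = r[0, 1], r[0, 2], r[1, 2]
b13_2_from_eq4 = float(s1 / s3 * (r13 - r12 * r23) / (1.0 - r23 ** 2))

if __name__ == "__main__":
    print(b_v1_x3, float(b13_2), b13_2_from_eq4)
```

The three printed values agree to rounding error, as the proof requires.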


MATHEMATICAL EQUIVALENT OF THE PROCESS OF 
SUCCESSIVE APPROXIMATION 


In the following paragraphs, a mathematical iterative or successive 
approximation method for obtaining the least squares regression co- 
efficients is briefly outlined. The notation applies to a four-variable 
problem. 

In the method of least squares, the coefficients $b_{12.34}$, $b_{13.24}$, and $b_{14.23}$
are determined by minimizing the quantity

$$f(b_{12.34}, b_{13.24}, b_{14.23}) = S[(X_1 - \bar{X}_1) - b_{12.34}(X_2 - \bar{X}_2)
- b_{13.24}(X_3 - \bar{X}_3) - b_{14.23}(X_4 - \bar{X}_4)]^2.$$

The solution yields the following three normal equations:

$$b_{12.34}a_{22} + b_{13.24}a_{23} + b_{14.23}a_{24} = a_{12} \tag{5}$$
$$b_{12.34}a_{23} + b_{13.24}a_{33} + b_{14.23}a_{34} = a_{13} \tag{6}$$
$$b_{12.34}a_{24} + b_{13.24}a_{34} + b_{14.23}a_{44} = a_{14}. \tag{7}$$





4 For purposes of derivation, it is convenient to express $X_1$ and $X_2$ in this equation in terms of
deviations from their respective means. Actual values, however, are used in the mechanics of the graphic
method. Since coding by subtraction does not affect the value of a regression coefficient, the proof
applies in either case.
These equations can be solved by well-known methods.

In the iterative process, values $b^{(1)}_{12.34}$ and $b^{(1)}_{13.24}$ are guessed for
$b_{12.34}$ and $b_{13.24}$ respectively in equation (5) and a solution for $b_{14.23}$ (say
$b^{(1)}_{14.23}$) is obtained from this equation. Then $b^{(1)}_{13.24}$ and $b^{(1)}_{14.23}$ are
substituted in equation (6) for $b_{13.24}$ and $b_{14.23}$ respectively and a second
approximation for $b_{12.34}$ (say $b^{(2)}_{12.34}$) is obtained. The values $b^{(2)}_{12.34}$ and
$b^{(1)}_{14.23}$ are substituted in equation (7) for $b_{12.34}$ and $b_{14.23}$ respectively
and a second approximation to $b_{13.24}$ (say $b^{(2)}_{13.24}$) is obtained. The
values $b^{(2)}_{12.34}$ and $b^{(2)}_{13.24}$ are substituted in equation (5) for $b_{12.34}$ and $b_{13.24}$
respectively and a second approximation to $b_{14.23}$ (say $b^{(2)}_{14.23}$) is
obtained. This process is repeated until the coefficients converge to stable
values.

The iterative process outlined above is equivalent to the following
steps: (1) Assign values $b^{(1)}_{12.34}$ and $b^{(1)}_{13.24}$ to $b_{12.34}$ and $b_{13.24}$ respectively
in the function $f(b_{12.34}, b_{13.24}, b_{14.23})$ defined above. Find that value of
$b_{14.23}$ which makes $f(b^{(1)}_{12.34}, b^{(1)}_{13.24}, b_{14.23})$ a minimum. Let it be $b^{(1)}_{14.23}$.
(2) Find that value of $b_{12.34}$ which makes $f(b_{12.34}, b^{(1)}_{13.24}, b^{(1)}_{14.23})$ a
minimum. Let this value be $b^{(2)}_{12.34}$. (3) Find that value of $b_{13.24}$ which makes
$f(b^{(2)}_{12.34}, b_{13.24}, b^{(1)}_{14.23})$ a minimum. Let this value be $b^{(2)}_{13.24}$. (4) Find
that value of $b_{14.23}$ which makes $f(b^{(2)}_{12.34}, b^{(2)}_{13.24}, b_{14.23})$ a minimum. Let
this value be $b^{(2)}_{14.23}$, etc. It will be seen that the steps involved in the
latter process are identical with those for the graphic method involving
three or more variables, as in each case deviations from the
approximations to the regression lines for the other independent variables are
plotted against one of the independent variables and the preceding
approximation to that regression line is adjusted so that it appears to be
the line of best fit.⁵
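The minimizing steps (1) to (4) can be sketched in a few lines. In the sketch below, which is an illustration under stated assumptions rather than the procedure of the paper, each coefficient in turn is set to the value that minimizes $f$ with the other two held at their latest values; this is the same as solving that coefficient's own normal equation. The function name, the simulated data, and the `order` argument are hypothetical.

```python
import numpy as np


def successive_approximation(a, rhs, start, rounds, order=(2, 0, 1)):
    """Minimize f one coefficient at a time.

    a      : matrix of the a_jk among the independent variables X2, X3, X4
    rhs    : vector (a_12, a_13, a_14)
    start  : first guesses for (b12.34, b13.24, b14.23)
    order  : updating order; (2, 0, 1) starts with b14.23 as in step (1)
    """
    b = np.array(start, dtype=float)
    for _ in range(rounds):
        for j in order:
            # Contribution of the other coefficients, held at latest values.
            others = a[j] @ b - a[j, j] * b[j]
            b[j] = (rhs[j] - others) / a[j, j]
    return b


# Simulated four-variable sample.
rng = np.random.default_rng(3)
n = 100
x2 = rng.normal(size=n)
x3 = 0.5 * x2 + rng.normal(size=n)
x4 = 0.3 * x2 + 0.3 * x3 + rng.normal(size=n)
x1 = 1.0 + 2.0 * x2 - 1.0 * x3 + 0.5 * x4 + rng.normal(size=n)

cov = np.cov(np.vstack([x1, x2, x3, x4]))
a = cov[1:, 1:]          # a_22 ... a_44
rhs = cov[0, 1:]         # a_12, a_13, a_14

exact = np.linalg.solve(a, rhs)                       # direct solution
approx = successive_approximation(a, rhs, [0.0, 0.0, 0.0], rounds=100)

if __name__ == "__main__":
    print("direct solution:          ", exact)
    print("successive approximation: ", approx)
```

With moderately correlated independent variables, as here, the successive approximations settle on the direct solution of the normal equations after a modest number of rounds.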


PROOF OF CONVERGENCE TO THE LEAST SQUARES VALUES 


The problem of convergence is considered for three variables only.
The normal equations for three variables are given by

$$b_{12.3}a_{22} + b_{13.2}a_{23} = a_{12}$$
$$b_{12.3}a_{23} + b_{13.2}a_{33} = a_{13}.$$

If the iterative process is performed on these two equations, then the
$K$th approximation to $b_{12.3}$ and $b_{13.2}$ respectively can be shown to be
equal to

$$b^{(K)}_{12.3} = \frac{s_1}{s_2}\left(r_{12} - r_{13}r_{23} + r_{12}r_{23}^2 - r_{13}r_{23}^3 + - \cdots - r_{13}r_{23}^{2K-3}\right) + b^{(1)}_{12.3}\,r_{23}^{2K-2} \tag{8}$$

and

$$b^{(K)}_{13.2} = \frac{s_1}{s_3}\left(r_{13} - r_{12}r_{23} + r_{13}r_{23}^2 - r_{12}r_{23}^3 + - \cdots + r_{13}r_{23}^{2K-2}\right) - \frac{s_2}{s_3}\,b^{(1)}_{12.3}\,r_{23}^{2K-1} \tag{9}$$

$$\phantom{b^{(K)}_{13.2}} = \frac{s_1}{s_3}\left(r_{13} - r_{12}r_{23} + r_{13}r_{23}^2 - r_{12}r_{23}^3 + - \cdots - r_{12}r_{23}^{2K-3}\right) + b^{(1)}_{13.2}\,r_{23}^{2K-2} \tag{10}$$

where $b^{(K)}_{12.3}$ and $b^{(1)}_{12.3}$ are the $K$th and 1st approximations
respectively to $b_{12.3}$, and $b^{(K)}_{13.2}$ and $b^{(1)}_{13.2}$ are the $K$th and the 1st
approximations respectively to $b_{13.2}$. But

$$b_{12.3} = \frac{s_1}{s_2}\left(r_{12} - r_{13}r_{23} + r_{12}r_{23}^2 - r_{13}r_{23}^3 + - \cdots\right) \tag{11}$$

and

$$b_{13.2} = \frac{s_1}{s_3}\left(r_{13} - r_{12}r_{23} + r_{13}r_{23}^2 - r_{12}r_{23}^3 + - \cdots\right), \tag{12}$$

which can be obtained by expanding the denominator of equations (3)
and (4) in an infinite series.
Hence, comparing equations (8) with (11) and (9) or (10) with (12),
it will be seen that $b^{(K)}_{12.3}$ and $b^{(K)}_{13.2}$ can be made to approximate $b_{12.3}$
and $b_{13.2}$ respectively as closely as desired by taking $K$ sufficiently large.

5 See Thomsen and Foote, op. cit., pp. 299-304.


SPEED OF CONVERGENCE OF REGRESSIONS 


The speed with which the successive approximations lead to stable 
results is of interest for two reasons: (1) It takes time to make succes- 
sive approximations and the charts become messy after several sets of 
dots have been inserted on them and (2) if the convergence is too slow, 
the analyst may think that no further correction is needed in the line 
with slope $b^{(K)}_{12.3}$ when in reality its slope is still quite different from the
mathematically calculated value.
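By making algebraic substitutions in equations (8), (9), and (10), it
can be shown that

$$b_{12.3} - b^{(K)}_{12.3} = r_{23}^{2K-2}\left(b_{12.3} - b^{(1)}_{12.3}\right) \quad\text{and} \tag{13}$$

$$b_{13.2} - b^{(K)}_{13.2} = -\frac{s_2}{s_3}\,r_{23}^{2K-1}\left(b_{12.3} - b^{(1)}_{12.3}\right) \tag{14}$$

$$\phantom{b_{13.2} - b^{(K)}_{13.2}} = r_{23}^{2K-2}\left(b_{13.2} - b^{(1)}_{13.2}\right). \tag{15}$$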


Equation (13) states that the difference between the mathematically
calculated $b_{12.3}$ and any given approximation is equal to a function of
the correlation between the independent variables times the error that
was made in the first approximation to $b_{12.3}$. It shows that the higher the
correlation between the independent variables, the slower will be the
speed of convergence.

Using the size of the original error (that is, $b_{12.3} - b^{(1)}_{12.3}$) as a base, it
can be stated from equation (13) that the percentage of error left after
the $K$th iteration is given by $r_{23}^{2K}$ times 100. Thus, if $r_{23} = 0.2$, the error
remaining after one iteration is 4 per cent of the original error and after
two iterations is 0.16 per cent, while if $r_{23} = 0.9$, the error remaining
after one iteration is 81 per cent of the original error. After two
iterations it is 65.61 per cent.
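These percentages, and the geometric shrinkage of the error stated in equation (13), can be reproduced with a short calculation. The sketch below is illustrative only; the helper names and the numerical values of the moments are assumptions chosen so that all standard deviations equal one.

```python
import math


def remaining_error_percent(r23: float, iterations: int) -> float:
    """Per cent of the original error left after a number of iterations."""
    return 100.0 * r23 ** (2 * iterations)


def iterations_needed(r23: float, fraction: float) -> int:
    """Smallest number of iterations leaving at most `fraction` of the error."""
    if r23 == 0.0:
        return 1
    return max(1, math.ceil(math.log(fraction) / math.log(r23 ** 2)))


def iterate_b12(a22, a23, a33, a12, a13, b12_start, iterations):
    """Alternate between the two three-variable normal equations."""
    b12 = b12_start
    for _ in range(iterations):
        b13 = (a13 - b12 * a23) / a33   # second normal equation
        b12 = (a12 - b13 * a23) / a22   # first normal equation
    return b12


# Unit standard deviations, so a_jk = r_jk (assumed values).
r12, r13, r23 = 0.5, 0.4, 0.9
exact_b12 = (r12 - r13 * r23) / (1.0 - r23 ** 2)

if __name__ == "__main__":
    for r in (0.2, 0.9):
        print(r, [remaining_error_percent(r, k) for k in (1, 2)])
    for k in (1, 2, 6, 8):
        error = exact_b12 - iterate_b12(1.0, r23, 1.0, r12, r13, 0.0, k)
        print(k, error, r23 ** (2 * k) * exact_b12)
```

Each pass through the loop multiplies the error in $b_{12.3}$ by $r_{23}^2$, so with $r_{23} = 0.9$ roughly a fifth of the remaining error is removed per iteration.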

Equation (13) also indicates the importance of the drift lines, since
if the error in $b^{(1)}_{12.3}$ is small, one or two iterations may be enough to
yield a fairly accurate approximation to the mathematically correct
regressions, but if the error is large and the correlation between the
independent variables is also large, 6 or 8 or even more successive
approximations may be required to bring the slope of the regression to
within 0.1 of the correct value.⁶ It is assumed in the graphic method
that the successive approximation process is continued until a visual
inspection indicates that no further improvement is possible.


SPEED OF CONVERGENCE OF MULTIPLE CORRELATION COEFFICIENT 


The size of the multiple correlation coefficient depends upon the
size of the deviations (or unexplained variation) from the final
regressions. If the regressions are inaccurate, the computed multiple
correlation coefficient will be inaccurate. In a three-variable problem, if
$r_{23}$ is near zero, convergence is rapid and errors in $b_{12.3}$ and $b_{13.2}$ are apt
to be small, but any errors made have relatively large effects on $R_{1.23}$.
When $r_{23}$ is near unity, convergence is slow and errors in $b_{12.3}$ and $b_{13.2}$
may be large, but these have relatively little effect on $R_{1.23}$ or on the
errors in predicting $X_1$. The same general reasoning applies to problems
involving more variables.

6 See Foote and Ives, op. cit., pp. 14-18. These equations apply exactly only to a mathematical
iterative process in which an original error is assumed but in which each successive iteration is a
mathematical best fit to the residuals remaining after the previous iteration. The additional error involved in
the graphic approximation to this is believed to be small except in those cases in which the succeeding
corrections become too small to distinguish visually.


INTERPRETING THE CORRELATIONS INDICATED
IN THE SCATTER DIAGRAMS


In general, the regression lines obtained in the several charts of the 
graphic method have been interpreted correctly as “net,” that is, par- 
tial, regressions between the dependent variable and the separate in- 
dependent variables. Some confusion has occurred in interpreting the 
correlations indicated by the plotted observations in the scatter dia- 
grams. This point can be cleared up if one is careful to note the exact 
meaning of each of the two variables represented by the horizontal and 
vertical scales of the charts, and considers the “visually indicated” cor- 
relation to be the simple correlation of these two variables. 

In the first chart of a three-variable problem, the two variables repre- 
sented by the vertical and horizontal scales are simply the dependent 
variable and one of the independent variables, $X_2$. Hence, this chart, as
originally plotted, indicates the simple correlation, $r_{12}$.

With respect to the second chart, if $X_1 - b_{12.3}X_2$ is considered as a
variable, $V_1$, and the simple correlation between $V_1$ and $X_3$ is obtained,
the resulting correlation will be equal to the part correlation ${}_{13}r_{2}$, as
defined by Ezekiel.⁷ In this sense we can say that the second chart
indicates the part correlation ${}_{13}r_{2}$. Likewise, if $X_1 - b_{13.2}X_3$ is considered
as another variable, $V_2$, and the simple correlation between $V_2$ and $X_2$
is obtained, that correlation will be equal to the part correlation ${}_{12}r_{3}$.
If $X_2$ were used as the second independent variable instead of $X_3$, the
second chart would then indicate ${}_{12}r_{3}$. Since the final dots plotted around
the final regression line in the first chart give the same result as would
have been shown in the final chart had the variables been reversed,
this scatter represents ${}_{12}r_{3}$. Similar results are given for a problem
involving more variables. The final dots plotted around the final
approximation to the regression line in each of the charts represent the
part correlation between the dependent variable and the respective
independent variable.

7 Ezekiel, Mordecai, Methods of Correlation Analysis, Ed. 2, John Wiley and Sons, Inc., New York,
1941, p. 213.

Part correlations as such do not appear to have much meaning in the
interpretation of an actual problem. However, by making certain
substitutions in Ezekiel’s formula for part correlation, it can be shown that

$${}_{13}r_{2}^{\,2} = \frac{r_{13.2}^2}{1 - (1 - r_{13.2}^2)\,r_{23}^2}.$$

Since, in the denominator of this formula, the quantity $1 - r_{13.2}^2$ is
non-negative and less than unity, the part correlation between two
variables is always equal to or greater than the partial correlation between
the same variables and the difference increases as $r_{23}$ increases. Because
of this relationship, charts indicating the degree of part correlation
can be used as an indication of the approximate size of the partial
correlation. If the indicated part correlation is low, the partial correlation
must be low, regardless of the correlations between the independent
variables, as in no case will the partial correlation be higher than
the corresponding part correlation. If the part correlation is high, then
either the corresponding partial correlation is relatively high or the
correlations between the independent variables are very high. In a
three-variable problem, for example, if the part correlation was 0.9 or
above, the corresponding partial correlation would be 0.77 or above
unless $r_{23}$ exceeded 0.8.
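The relation between part and partial correlation can be evaluated numerically. The sketch below simply codes the formula given above and its algebraic inverse; the function names are illustrative and not taken from Ezekiel or from the paper.

```python
import math


def part_from_partial(partial: float, r23: float) -> float:
    """Part correlation implied by a partial correlation and r23."""
    return math.sqrt(partial ** 2 / (1.0 - (1.0 - partial ** 2) * r23 ** 2))


def partial_from_part(part: float, r23: float) -> float:
    """Inverse of the relation above, solved for the partial correlation."""
    return math.sqrt(part ** 2 * (1.0 - r23 ** 2) / (1.0 - part ** 2 * r23 ** 2))


if __name__ == "__main__":
    # Part correlation of 0.9 with independent variables correlated 0.8.
    print(round(partial_from_part(0.9, 0.8), 3))
    # The part correlation is never below the partial correlation.
    for r23 in (0.0, 0.3, 0.6, 0.9):
        print(r23, round(part_from_partial(0.5, r23), 3))
```

For a part correlation of 0.9 and $r_{23} = 0.8$ the implied partial correlation is about 0.78, consistent with the bound quoted in the text.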


DEVIATIONS FROM THE REGRESSIONS 


Some investigators have been puzzled by the fact that the deviations 
from the regression lines in certain charts are exactly equal. If the 
mathematically calculated values for the partial regression coefficients 
are obtained, the deviation of the final dot for any given observation 
from the regression line in each chart will be identical. Likewise, if 
calculated values for the dependent variable are plotted against actual 
values and a line is drawn through the origin with a slope of 1, the 
deviation of the calculated value from this line for any given observa- 
tion will be the same as the deviations discussed above. (The simple 
correlation between these two variables equals the multiple correlation 
for the analysis.) This follows from the fact that the deviation for the
$i$th observation in each of these charts (for a three-variable analysis) is
given by

$$d_i = X_{1i} - b_{12.3}X_{2i} - b_{13.2}X_{3i}.$$

Different degrees of correlation are still indicated by the various 
charts, as the degree of correlation reflects not only the deviations of 
the dots from the respective regression lines but also the relative range 
in the dependent variable involved. This is clear from the definition of 
the coefficient of determination, which is the percentage of variation in 
the dependent variable explained by the independent variable. The 
sum of the squared deviations represents the unexplained variation or 
the total variation in $X_1$ minus the variation explained. But to
translate this into a correlation coefficient, the total amount of variation in
the dependent variable to start with must also be known. In the chart
indicating the degree of multiple correlation, this is the total variation
in $X_1$. But in the charts indicating part correlations, it is the amount
of variation remaining in $X_1$ after adjusting for the effects of the
alternative independent variables.
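Both points, identical deviations and differing degrees of correlation, can be seen in a small simulated example. In the sketch below the data, the variable names, and the explicit constant term `b0` are assumptions made for illustration (the constant is needed because actual values rather than deviations from means are used).

```python
import numpy as np

# Simulated three-variable sample.
rng = np.random.default_rng(2)
n = 60
x2 = rng.normal(size=n)
x3 = 0.5 * x2 + rng.normal(size=n)
x1 = 2.0 + 1.0 * x2 - 0.8 * x3 + rng.normal(scale=0.5, size=n)

# Mathematically calculated regression with constant term b0.
design = np.column_stack([np.ones(n), x2, x3])
b0, b12_3, b13_2 = np.linalg.lstsq(design, x1, rcond=None)[0]
calculated = b0 + b12_3 * x2 + b13_2 * x3

# Deviations of the final dots from the regression line in each chart.
dev_chart_x2 = (x1 - b13_2 * x3) - (b0 + b12_3 * x2)
dev_chart_x3 = (x1 - b12_3 * x2) - (b0 + b13_2 * x3)
dev_actual_vs_calculated = x1 - calculated


def corr(u, v):
    """Simple correlation coefficient between two series."""
    return float(np.corrcoef(u, v)[0, 1])


multiple_r = corr(x1, calculated)          # chart of actual against calculated
part_12_3 = corr(x1 - b13_2 * x3, x2)      # final dots in the X2 chart
part_13_2 = corr(x1 - b12_3 * x2, x3)      # final dots in the X3 chart

if __name__ == "__main__":
    print("largest difference between deviations:",
          float(np.max(np.abs(dev_chart_x2 - dev_chart_x3))))
    print("multiple R:", multiple_r)
    print("part correlations:", part_12_3, part_13_2)
```

The deviations agree chart by chart, while the three correlations differ because each is measured against a different amount of variation in the vertical variable.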


EFFECTS OF NOT PASSING THE REGRESSIONS THROUGH THE MEANS 


In the original description of the graphic method no mention was 
made of drawing the regressions through the means of the variables. In 
his original article, Bean stated: “At this point it may be observed that
the arbitrary placing of the approximation curves without reference to
the average values of $X_1$ and of the other variables does not affect the
values of $X_1$ computed from the curves. For example, had the
approximation curve in section 1 been placed higher, the residuals in sections
2 and 3 would have been correspondingly decreased and the curves
lowered.”⁸ The truth of this is fairly obvious.


APPLICATION TO PROBLEMS INVOLVING MORE VARIABLES OR 
CURVILINEAR RELATIONSHIPS 


Most of the mathematical proofs have been given for problems that 
involve three or four variables. The extension to problems involving 
more variables is obvious, although the algebra becomes complicated. 

The graphic method was developed primarily to handle curvilinear 
rather than linear relationships. The proofs given here are in terms of 
linear relationships because the least squares method, as usually con- 
sidered, is applicable to linear relationships or those that can be trans- 
formed to a linear form. Thus, it is easier to show the relationships 
between the graphic and the mathematical methods by confining the 
discussion to the linear case. It has been generally recognized by 
mathematicians and can easily be demonstrated by example that the 
graphic method provides at least a satisfactory method for obtaining 
approximations to the net regression curves when dealing with multiple 
functional relationships, regardless of whether the nature of the func- 
tion is known. The extent to which the graphic method can be used to 
determine the nature of curves for stochastic or probability relation- 
ships will depend mainly on the degree of correlation and the extent to 
which the sample represents the population. As, in most cases, one
never knows for sure whether a given small sample is representative of
the population, any user of regression methods must proceed with
caution and must subject his final results to common sense and to any other
outside checks at his disposal. Some users of the graphic method, as
well as many users of mathematical methods, have assumed that their
methods, as such, are sufficiently reliable so that outside checks are not
necessary.

8 Bean, op. cit., p. 393. A mathematical proof is given in Foote and Ives, op. cit., pp. 32-33.


SUMMARY 


The method of graphic multiple correlation suggested by L. H. Bean 
essentially is based upon three mathematical principles: 

(1) The multiple regression equation becomes the equation of a curve 
when all of the independent variables except one are held constant. In 
the case of linear regression, the curve is a straight line whose slope is 
equal to the partial regression coefficient between the dependent vari- 
able and that independent variable which is permitted to vary. For this 
reason the slopes of the drift lines in the first chart of a three-variable 
analysis indicate the partial regression coefficient. 

(2) If in a three-variable analysis, the true partial regression line is 
obtained in the first chart, the simple correlation between deviations 
from this line and the second independent variable is equal to the part 
correlation (as defined by Ezekiel) and the simple regression is equal to 
the partial regression. Thus, the line obtained in the second chart of a 
three-variable analysis approximates the second partial regression. If 
the degree of correlation between the two independent variables is low, 
the part correlation nearly equals the partial correlation. Hence, the 
scatter in this chart in most cases indicates approximately the degree 
of partial correlation. 

(3) The method of successive approximation outlined by Bean is 
analogous to a mathematical iterative process which converges to the 
least squares solution. Thus, even if an error is made in the first ap- 
proximations to the regressions, succeeding approximations will tend 
to yield more and more accurate results. 

The speed of convergence depends chiefly on the size of the error in 
the first approximation and the size of the correlation between the in- 
dependent variables. The better the first approximation and the 
smaller the intercorrelation, the faster will the process tend to converge. 
The degree of intercorrelation is determined by the nature of the vari- 
ables included in the analysis and hence, once the variables are chosen, 
very little can be done graphically to speed up the convergence. How- 
ever, the accuracy of the first approximations may be greatly enhanced 
by the use of drift lines. 

The same reasoning can be extended to problems that involve more 
than three variables. 





RECENT ADVANCES IN FINDING BEST 
OPERATING CONDITIONS* 


R. L. ANDERSON 
Institute of Statistics, North Carolina State College 


This paper discusses various experimental procedures used to
estimate the optimal point on a response surface and to explore the
nature of the response surface in the vicinity of this optimum. Multi- 
factor experiments were first set up to investigate one factor at a time; 
then Fisher and Yates introduced the complete factorials for field 
experiments, plus confounded arrangements for incomplete blocks 
designs. More recently, fractional replication designs have been intro- 
duced to cut down the size of the experiments. 

Hotelling devised methods of locating the optimal point using a sin- 
gle factor. Friedman and Savage outlined a sequential one-factor-at- 
a-time procedure when several factors are involved. 

Box and Wilson present a method of locating the optimum and of 
exploring the response surface in which many factors are varied at the 
same time. They present the use of the path of steepest ascent to get to 
a “near-stationary” region if the experimenter starts at a point far 
removed from it. When the experimenter is near such a region, they 
present the use of a composite design to estimate quadratic and inter- 
action effects. The nature of the response surface is explored by the 
use of a canonical transformation. 

The usefulness of these sequential procedures in various experimental 
situations is discussed. 


1. INTRODUCTION 


Most experimentation has as its ultimate objective the estimation of 
some optimal response. However, the lack of a simple experimental 
procedure to achieve this objective has resulted in a tremendous num- 
ber of piecemeal experiments, each designed to pinpoint some section 
of the response surface. This paper will discuss some contributions to 
the problem of maximizing a function, 


$$y = \phi(x_1, x_2, \ldots, x_k), \qquad (1)$$


where $y$ is the expected response and $x_i$ the amount of the $i$th factor
used in producing $y$. For example, a quadratic response function might
be written in this form:


* Revision of a paper presented at the 1952 annual meeting of the American Statistical Association.

$$y = \beta_0 + \sum_{i=1}^{k} \beta_i x_i + \sum_{i=1}^{k} \beta_{ii} x_i^2 + \sum_{i=1}^{k-1} \sum_{j=i+1}^{k} \beta_{ij} x_i x_j, \qquad (2)$$

where $\beta_i$ is called a linear or main effect, $\beta_{ii}$ a quadratic effect and
$\beta_{ij}$ ($i \neq j$) an interaction effect. Of course, the optimal response may be
a minimum (such as with costs), but the procedures are the same for 
determining a minimum as for a maximum. In general the word 
“optimal” will refer to either case. The problem is one of finding the 
level of each factor to achieve optimal y, assuming that the factor levels 
can be continuously varied. The combination of factor levels which 
produces the optimal response will be called the optimal factor combina- 
tion. 
In general, it would be advantageous to know the response function 
itself. For example, most production is carried on for a profit. But the 
optimal factor combination usually will change with a change in the
factor and product prices. Assume the production function is of the
form


$$q = q(x_1, x_2, \ldots, x_k),$$


where the parameters of $q$ have been estimated. The profit is then


$$\pi = qp - \sum_{i=1}^{k} x_i p_i,$$


where $p$ is the price of the product and $p_i$ the price of the $i$th factor.
Then the static optimal factor combination is given by the solution of
the $k$ equations

$$\frac{\partial q}{\partial x_i} = \frac{p_i}{p}, \qquad i = 1, 2, \ldots, k. \qquad (3)$$

Of course, the dynamic solution is more complicated, because changes
in $q$ and the $x_i$ can be expected to change $p$ and the $p_i$, especially if this
product is an important part of the economy. In fact, one might need 
to know the demand function for the product and the supply functions 
for the factors. But the important point to note here is that, once these 
functions have been determined, the determination of the optimum 
requires no more experimentation. 
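
The static condition (3) can be illustrated numerically. The following is a minimal sketch in Python, assuming a two-factor quadratic production function of the form (2). The function names, the coefficients and the prices are hypothetical and are not taken from any experiment discussed here.

```python
def static_optimum(b1, b2, b11, b22, b12, p, p1, p2):
    """Solve dq/dx_i = p_i / p (equation 3) for a two-factor quadratic q.

    Assumed form (hypothetical coefficients, not from the paper):
        q = b0 + b1*x1 + b2*x2 + b11*x1**2 + b22*x2**2 + b12*x1*x2
    The constant b0 does not affect the location of the optimum.
    """
    # First-order conditions written as a 2 x 2 linear system:
    #   2*b11*x1 +   b12*x2 = p1/p - b1
    #     b12*x1 + 2*b22*x2 = p2/p - b2
    a11, a12, a21, a22 = 2.0 * b11, b12, b12, 2.0 * b22
    r1, r2 = p1 / p - b1, p2 / p - b2
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("first-order conditions have no unique solution")
    # Cramer's rule for the two unknown factor levels.
    x1 = (r1 * a22 - a12 * r2) / det
    x2 = (a11 * r2 - a21 * r1) / det
    return x1, x2


def profit(x1, x2, b0, b1, b2, b11, b22, b12, p, p1, p2):
    """Profit q*p - x1*p1 - x2*p2 for the same assumed quadratic q."""
    q = b0 + b1 * x1 + b2 * x2 + b11 * x1 ** 2 + b22 * x2 ** 2 + b12 * x1 * x2
    return q * p - x1 * p1 - x2 * p2


# Hypothetical estimated coefficients and prices.
coef = dict(b1=10.0, b2=8.0, b11=-1.0, b22=-2.0, b12=1.0)
prices = dict(p=2.0, p1=4.0, p2=2.0)
x1_opt, x2_opt = static_optimum(**coef, **prices)
print(x1_opt, x2_opt)
```

With a change in the prices only the right-hand sides of the two equations change, so the optimum can be recomputed without further experimentation, as the text notes.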
Experimental procedures for estimating the parameters of a multi-factor
response function are now being developed. Box [3] discusses
multi-factor designs for estimating a planar response surface. He shows
that

(i) when prior knowledge of the response surface exists, the design may be
rotated to reduce possible biases (e.g. quadratic and interaction), and
(ii) rotation can be used to eliminate such systematic effects as time trends.


Box and Wilson [2] describe methods of exploring the response surface 
in a “near-stationary” region. 

Before starting any experimentation to explore the response surface, 
the experimenter must select the factors and the factor levels to be 
used in the experiment. The factors are usually decided on the basis of 

(i) previous experimentation and theoretical study in the field, 
(ii) practical consideration of factors which can be varied in the production
process and in the experiment, and
(iii) time and facilities available.


The selection of the factor levels is usually a matter of judgment on 
the part of the experimenter. He considers the possible range of the 
factor levels and previous experience on the differences in levels needed 
to produce detectable response changes, if such exist. These problems 
are common to all the experimental procedures to be discussed in this 
paper. They are largely non-statistical problems, but the statistician 
should be sure that the experimenter understands the importance of 
selecting the correct factors and suitable factor levels. 


2. FACTORIAL EXPERIMENTS 


In the first multi-factor experiments, a single factor was varied at
a time. For example with 5 factors, one might plan $5l$ experiments,
in which each of the factors in turn was used at $l$ levels while the other
4 factors were held at some starting level. Fisher [9] and Yates [22]
encouraged the use of complete factorials and developed a large number
of special designs involving them. In a complete factorial, all
combinations of the factor levels are used, e.g., $l^5$ for the above experiment.
These designs were developed for experiments in which the
experimental error could not be neglected. In order to estimate the 
magnitude of this error in each experiment, the experiment had to be 
repeated several times, say r. These factorial designs were formed large- 
ly for field experiments in which sequential experimentation would be 
less useful than with laboratory experiments, and the factors were 
often of the discrete type, e.g. varieties or rations. 
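
The difference in size between the two plans can be seen by enumeration. The sketch below is an illustration only: the function names are hypothetical, levels are coded 0, 1, ..., and 5 factors at 3 levels are assumed.

```python
from itertools import product


def one_factor_at_a_time(k, levels, start=0):
    """Runs in which each of k factors in turn takes every level while
    the remaining factors stay at the starting level."""
    runs = []
    for i in range(k):
        for level in range(levels):
            combo = [start] * k
            combo[i] = level
            runs.append(tuple(combo))
    return runs


def complete_factorial(k, levels):
    """All combinations of the factor levels."""
    return list(product(range(levels), repeat=k))


ofat = one_factor_at_a_time(5, 3)
full = complete_factorial(5, 3)
# Planned one-factor runs, distinct one-factor runs, complete factorial runs.
print(len(ofat), len(set(ofat)), len(full))
```

The one-factor plan repeats the starting combination once per factor, so the number of distinct combinations is smaller than the number of planned runs.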

Because of the large number of factor combinations required in many 
field experiments, it was felt that some form of incomplete block 
design was needed to reduce the experimental error. This resulted in 
the so-called confounded designs, e.g. with $2^k$, $3^k$, $3 \times 2^k$, $3^2 \times 2$, $4^k$ designs.
These are described by Yates [23]. More complicated factorial designs
have been constructed by Nair [17, 18], Bose [1], Finney [8], and Li
[16], among others.

When physical scientists and engineers became interested in multi- 
factor experiments, they found that complete and confounded factorials 
required too many experimental units, especially since the experimental 
errors were often much lower than in field experiments. One method of 
reducing the number of experimental units was to use higher order 
interaction effects to estimate the error and hence avoid repetitions 
of the design. Then Finney [6, 7], Plackett and Burman [19], Kempthorne
[14], Rao [20], and Davies and Hay [5] developed the fractional
replication designs, based on using parts of the confounded designs.
Yates [22] and Hotelling [13] had already mentioned the use of such
designs. A new approach for continuous factor levels has been suggested
by Box [3].


3. LOCATING THE OPTIMAL POINT USING A SINGLE FACTOR 


Hotelling [12] considers in detail the problem of obtaining a maxi- 
mum response when only one factor is involved, advocating the fol- 
lowing experimental procedure: 

(i) An early speculative study of the problem to indicate the range within which 
the optimum lies. 
This study should also include some good theory to help delimit the 
problem. 

(ii) An intermediate stage to supply estimates of the parameters of the response
function.

One might use six equally spaced values within the range in (i) to fit a
fifth degree polynomial and estimate the optimum point $\xi$. If several samples
are obtained at each point, one can also estimate $\sigma$.

(iii) A final experiment.

Let $z$ measure the deviation from $\xi$ and assume the true response equation
is

$$f(z) = \beta_0 + \beta_1 z + \beta_2 z^2 + \beta_3 z^3 + \beta_4 z^4 + \cdots. \qquad (4)$$

Assume $f(z)$ can be approximated by a quadratic equation, so that 3 or
more values of $z$ are needed.


The estimates of the parameters in the quadratic equation will
be biased if $\beta_3, \beta_4, \ldots$ in equation (4) are not zero. Hotelling shows
how to allocate N sample values so as to make the cubic bias zero and 
the quartic bias a minimum, assuming the variance is fixed. 
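
The source of the cubic bias can be shown with a small sketch. It is not Hotelling's allocation procedure; it only fits a quadratic exactly through three equally spaced points $z = -h, 0, +h$ of a hypothetical error-free response of form (4) with $\beta_3 \neq 0$. The fitted linear coefficient then equals $\beta_1 + \beta_3 h^2$ rather than $\beta_1$. The function names and coefficient values are assumptions made for the illustration.

```python
def quadratic_through_three_points(f, h):
    """Fit b0 + b1*z + b2*z**2 exactly through z = -h, 0, +h."""
    f_minus, f_zero, f_plus = f(-h), f(0.0), f(h)
    b0 = f_zero
    b1 = (f_plus - f_minus) / (2.0 * h)
    b2 = (f_plus + f_minus - 2.0 * f_zero) / (2.0 * h * h)
    return b0, b1, b2


def true_response(z, beta=(1.0, 0.5, -1.0, 0.2)):
    """Hypothetical cubic response: beta0 + beta1*z + beta2*z^2 + beta3*z^3."""
    beta0, beta1, beta2, beta3 = beta
    return beta0 + beta1 * z + beta2 * z ** 2 + beta3 * z ** 3


b0, b1, b2 = quadratic_through_three_points(true_response, h=1.0)
# b1 absorbs beta3 * h**2, so the estimated optimum -b1 / (2*b2) is shifted.
estimated_optimum = -b1 / (2.0 * b2)
print(b1, b2, estimated_optimum)
```

Here the quadratic coefficient is recovered without bias from the cubic term, while the linear coefficient, and hence the estimated location of the optimum, is not.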

Hotelling briefly studied the case of two factors, for which 6 points 
are needed to estimate linear, quadratic and first order interaction 
coefficients. In order to make the cubic bias vanish, he established that 


(i) no 3 points lie on a straight line,
(ii) no 4 points lie on 2 straight lines through the origin,
(iii) no 4 points can be vertices of a parallelogram,
(iv) the 6 points cannot consist of the origin and the vertices of a regular
pentagon with center at the origin.


4. A SEQUENTIAL ONE FACTOR-AT-A-TIME DESIGN 


Friedman and Savage [11] described a sequential multi-factor plan 
to locate a local maximum and to describe the response surface near 
this maximum. They wanted to explore the response surface near the 
maximum in order to 


(i) indicate the seriousness of choosing a factor combination somewhat 
different from the maximum in order to protect other qualities than the 
one studied, 

(ii) determine the relative importance of various factors, 

(iii) serve as a stimulus to develop the theoretical nature of the response, 
(iv) indicate the seriousness of a lack of control of factor levels in the produc- 
tion process. 


They reject the complete factorial design because 


(a) the levels chosen may be far from the maximum, 
(b) if one chooses levels too far apart, he may obtain a very superficial de- 
scription of the response surface near the maximum. 


In addition, they point out that the factorial design is essentially a 
discrete level design and does not take account of the essential con- 
tinuity or ordered character of many factor levels. This is tied in with 
the Hotelling results, which show how one can improve the estimate of 
the maximum by choosing the levels at unequal intervals and by using 
a different number of samples for each level. However, the use of or- 
thogonal linear forms simplifies tests of linear, quadratic, and higher 
components when the levels are equally spaced. 

The Friedman-Savage procedure is as follows: 


(i) Use the best estimate of the optimal factor combination as the initial 
one. 

(ii) Order the factors in some manner. The authors do not say how to do this, 
but one might order them according to his estimate of the possible effects 
of changes of each on the final response. For example, if one factor were 
very important, the experiments might not detect differences for other 
factors unless this first factor were near its optimal level. 

(iii) Vary the levels of only the first factor until an approximate optimum
is located for it. Presumably the Hotelling idea of fitting a polynomial
would be useful if the levels were continuous.

(iv) Using the optimal level of the first factor and the starting point of all but 
the second, find the optimal level for the second; proceed in this manner 
until all factors have been investigated. 

(v) If necessary, repeat another round, but start with the set of local optima
in (iv).

(vi) If the changes in the second round indicate a need for further experimentation,
tion, it might be advisable to proceed on a path defined by the two sets 
of local optima. This is similar to a device used by Box and Wilson, which 
will be explained later. 
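
Steps (i) to (v) can be sketched as follows. This is a simplified illustration and not Friedman and Savage's own computation: the response is a hypothetical error-free function, each factor is searched over a fixed grid of levels, and the function names are assumptions.

```python
def one_factor_rounds(response, start, grids, rounds=2):
    """Vary one factor at a time, keeping the best level found so far.

    response: function of a list of factor levels
    start:    initial factor combination (step i)
    grids:    candidate levels for each factor, in the chosen order (step ii)
    Returns the factor combination held after each round.
    """
    current = list(start)
    history = [tuple(current)]
    for _ in range(rounds):
        for i, grid in enumerate(grids):
            # Steps (iii)-(iv): optimize factor i with the others held fixed.
            current[i] = max(
                grid,
                key=lambda v: response(current[:i] + [v] + current[i + 1:]),
            )
        history.append(tuple(current))  # step (v): next round starts here
    return history


def toy_response(x):
    """Hypothetical response with an interaction; maximum at (2, 3)."""
    u, v = x[0] - 2, x[1] - 3
    return -(u * u) - (v * v) - 0.5 * u * v


history = one_factor_rounds(toy_response, (0, 0), [range(6), range(6)])
print(history)
```

Because of the interaction term, the first round stops short of the maximum and the second round is needed to reach it.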


Friedman and Savage suggest that the differences between factor 
levels should be reduced as one gets closer to the optimum. This en- 
ables one to map the surface near the optimum. However, if the experi- 
mental error is very large, this error may mask the small response 
differences near the optimum. 

Friedman and Savage made a number of comparisons, showing the 
smaller number of experiments with the sequential plan as compared to 
complete factorials. However, they did not make comparisons with 
fractional factorial designs. If there are many factors, it is possible to 
use a small fraction of the complete factorial without confounding 
main effects and 2-factor interactions with each other. For example, 
1/8 of a $2^{10}$ design will enable one to estimate all main effects and 2-factor
interactions if all 3 and higher-factor interactions are negligible,
and similarly for 1/9 of a $3^7$ design [see Kempthorne [15], Sec. 21.7].
If previous information indicates that certain of the 2-factor interac- 
tions can be neglected, the designs could be even further fractional- 
ized. 
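
The property described above can be checked on a smaller case than those cited. The sketch below assumes a half replicate of a $2^5$ design with the (assumed) defining relation $I = ABCDE$, and verifies that the contrast columns for all main effects and 2-factor interactions are mutually orthogonal. The function names are hypothetical.

```python
from itertools import combinations, product


def half_fraction(k):
    """Runs of a 2^k design (levels coded -1, +1) whose levels multiply to +1."""
    runs = []
    for point in product((-1, 1), repeat=k):
        sign = 1
        for level in point:
            sign *= level
        if sign == 1:
            runs.append(point)
    return runs


def effect_columns(runs, k):
    """Contrast columns for all main effects and 2-factor interactions."""
    columns = {}
    for i in range(k):
        columns[(i,)] = tuple(r[i] for r in runs)
    for i, j in combinations(range(k), 2):
        columns[(i, j)] = tuple(r[i] * r[j] for r in runs)
    return columns


def all_orthogonal(columns):
    """True if every pair of distinct contrast columns has zero inner product."""
    for a, b in combinations(list(columns), 2):
        if sum(u * v for u, v in zip(columns[a], columns[b])) != 0:
            return False
    return True


runs5 = half_fraction(5)
print(len(runs5), all_orthogonal(effect_columns(runs5, 5)))
```

The same check fails for a half replicate of a $2^4$ design, where 2-factor interactions are confounded with each other.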


5. A SEQUENTIAL DESIGN VARYING MANY FACTORS AT A TIME


A recent article by Box and Wilson [2] is devoted to the problem of 
determining optimal factor combinations in chemical investigations. 
Their methods also enable the response surface to be described in the 
neighborhood of the optimum. The discussion of these methods in the 
1951 article was condensed for publication. After discussion with Dr. 
Box, this writer believes the following is a correct description of the 
Box-Wilson techniques. 

(i) Conduct some initial experiments in the vicinity of the previously 
known best factor combination. These initial factor combinations probably
would be based on a complete or fractional factorial design, usually
of the $2^k$ type. The $2^k$ designs are simple to analyze and interpret and
give good estimates of main effects and 2-factor interactions.

(ii) If the main effects in step (i) are large compared to the 2-factor 
interactions (i.e., the response surface is roughly planar in the region 
of these initial factor levels), the experimenter would be led to try new 
factor levels which are changed in the direction of largest response in 
the initial experiments. Box and Wilson explore with new experiments 
the path of steepest ascent (or descent if a minimum is desired), in which 
each factor is varied proportionally to its unit effect in the initial experiments.
The procedure of steps (i) and (ii) is repeated until the
first order effects are small, so that no further progress is possible by
this method.¹ The experimenter is then brought to a near-stationary
region. 

A technique is provided for avoiding gross errors in selecting the 
ranges of the factor levels in the initial set of experiments. 
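
Steps (i) and (ii) can be sketched in a few lines. The example is a simplified illustration and not Box and Wilson's own computation: it assumes a complete $2^k$ design in coded units ($\pm 1$), an error-free hypothetical planar response, and function names chosen for the illustration.

```python
from itertools import product


def first_order_fit(response, k):
    """Estimate b0 and the first-order coefficients from a complete 2^k design."""
    design = list(product((-1, 1), repeat=k))
    ys = [response(x) for x in design]
    n = len(design)
    b0 = sum(ys) / n
    # With coded levels -1 and +1 the least-squares slope is sum(x_i * y) / n.
    slopes = [sum(x[i] * y for x, y in zip(design, ys)) / n for i in range(k)]
    return b0, slopes


def steepest_ascent_path(slopes, steps, step_size=1.0):
    """Points along the path from the design center, with each factor
    changed in proportion to its estimated first-order coefficient."""
    norm = sum(b * b for b in slopes) ** 0.5
    direction = [b / norm for b in slopes]
    return [[t * step_size * d for d in direction] for t in range(1, steps + 1)]


def plane(x):
    """Hypothetical planar response in coded units."""
    return 10.0 + 3.0 * x[0] + 4.0 * x[1]


b0, slopes = first_order_fit(plane, 2)
path = steepest_ascent_path(slopes, steps=3, step_size=0.5)
print(b0, slopes, path)
```

For a minimum the signs of the direction would be reversed, giving the path of steepest descent.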

(iii) When the experimenter has reached a near-stationary region,
he conducts some additional experiments specifically designed to estimate
the quadratic and interaction effects in equation (2). The $3^k$
designs have been developed to do this; however, the size of a $3^k$ experiment
becomes unwieldy for large $k$. One notes that the 2-factor interaction
effects for a $3^k$ experiment can be divided into four groups:
linear × linear; quadratic × linear; linear × quadratic; and quadratic
× quadratic. Presumably the latter effects (which are of the fourth
degree) would be negligible, and perhaps the middle two groups of
effects (which are of the third degree). Hence one might like to use a
design which would enable him to estimate only the linear, quadratic
and linear × linear interaction effects [the parameters in equation
(2)].

Box and Wilson’s composite design was developed to accomplish this
purpose. One form of this design is to add $(2k+1)$ experiments to the
last set of $2^k$ experiments in step (ii). If we designate the factor levels
at the center of this $2^k$ design as $(0, 0, \ldots, 0)$, the new factor combinations
would be


$$(0, 0, \ldots, 0);\ (\pm\alpha, 0, \ldots, 0);\ (0, \pm\alpha, \ldots, 0);\ \ldots;\ (0, 0, \ldots, \pm\alpha).$$


$\alpha$ can be determined either so that the design is orthogonal (the estimated
effects are all non-correlated) or so that the second order effects
are estimated with equal precision. If the factorial experiments had 
indicated that the optimum was near one of the corners of the factorial 
design, the center for the composite design could be located at this 
corner. 
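
The form of the composite design described above can be written out directly. The sketch below assumes coded levels $\pm 1$ for the factorial portion and takes $\alpha$ as a user-supplied value; it does not compute the value of $\alpha$ that gives orthogonality or equal precision. The function name is hypothetical.

```python
from itertools import product


def composite_design(k, alpha):
    """2^k factorial points plus the (2k + 1) added points: the center
    and the pairs (+alpha, -alpha) on each factor axis."""
    factorial = [tuple(float(v) for v in pt) for pt in product((-1, 1), repeat=k)]
    center = [tuple(0.0 for _ in range(k))]
    axial = []
    for i in range(k):
        for sign in (1, -1):
            point = [0.0] * k
            point[i] = sign * alpha
            axial.append(tuple(point))
    return factorial + center + axial


k = 3
design = composite_design(k, alpha=1.5)
n_parameters = (k + 1) * (k + 2) // 2  # parameters in equation (2)
print(len(design), 3 ** k, n_parameters)
```

For three factors the composite design uses 15 factor combinations to estimate the 10 parameters of equation (2), against 27 for the $3^3$ design.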

(iv) Once the experimenter has obtained rather stable estimates 
of the parameters in equation (2), the optimal factor levels, 


$$x^0 = (x_1^0, x_2^0, \ldots, x_k^0), \qquad (5)$$


can be estimated. The predicted response for this factor combination
is


$$y^0 = b_0 + \tfrac{1}{2} \sum b_i x_i^0, \qquad (6)$$


¹ When the response becomes almost stationary in the first path, a new set of $2^k$ experiments is
conducted to determine if the first order effects are small in this new region or if a new path should be
followed. It should be pointed out that if the main effects had been small compared to the interaction
effects in the initial experiments of step (i), step (ii) would be omitted.

where $b_i$ is the estimate of $\beta_i$ in (2). After shifting the origin of the
system to $x^0$, the quadratic form (2) can be reduced to the canonical
form


$$\hat{y} = y^0 + \sum \lambda_i X_i^2, \qquad (7)$$


where $\hat{y}$ is the estimate of $y$ and $X_i$ are linear functions of the $x_i$. These
$X_i$ are the axes of a coordinate system with center at $x^0$.
(v) The following tentative conclusions could be made: 


(a) If $x^0$ is not far removed from the experimental center and the $\lambda_i$ in (7) are
of the same sign, the experimenter can conclude that $y^0$ is near the true
optimum response. He probably would conduct several confirmatory
experiments with factor combinations near $x^0$ and then reevaluate $x^0$
until fairly stable results were obtained.

(b) If one of the $\lambda$'s is small relative to the others, the response surface has a
ridge along the corresponding $X$-axis. Dr. Box states that, “often the
most important practical problem is to determine the nature of the local
ridge system.” If a ridge is present, the experimenter can then use as the
optimal factor combination the one along this ridge which is cheapest or
easiest to use or the one which produces the optimal response for some
other characteristic.

(c) If some of the larger $\lambda$'s in (7) are of opposite signs, the experimenter is
at a saddlepoint; Box and Wilson outline additional experiments to use
in this case.

(d) If $x^0$ is far removed from the experimental center, equation (7) could not
describe the surface at $x^0$. In this case one would suspect the existence of a
rising ridge along the axis of $X_1$, say, with a small value of $\lambda_1$ in (7). It
would now be advisable to shift to an origin on $X_1$ but near the experimental
center and obtain as a substitute for equation (7):


$$\hat{y} = y' + B_1 X_1' + \lambda_1 {X_1'}^2 + \sum_{i=2}^{k} \lambda_i X_i^2, \qquad (7')$$


where $y'$ is the predicted response at this new origin. The experimenter
would then explore along the $X_1'$-axis.
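
The computations in steps (iv) and (v) can be sketched for two factors. The example below is an illustration under stated assumptions, not Box and Wilson's own procedure: the fitted coefficients are hypothetical, the $\lambda$'s are obtained as the latent roots of the symmetric matrix with diagonal elements $b_{11}, b_{22}$ and off-diagonal elements $b_{12}/2$, and the function names are chosen for the illustration.

```python
import math


def canonical_analysis(b0, b1, b2, b11, b22, b12):
    """Stationary point x0, predicted response y0 (equation 6) and the
    canonical coefficients (lambdas in equation 7) for two factors."""
    half = b12 / 2.0
    det = b11 * b22 - half * half
    if det == 0:
        raise ValueError("no unique stationary point")
    # x0 solves the first-order conditions b + 2*B*x0 = 0.
    x1 = -(b22 * b1 - half * b2) / (2.0 * det)
    x2 = -(b11 * b2 - half * b1) / (2.0 * det)
    y0 = b0 + 0.5 * (b1 * x1 + b2 * x2)
    # Latent roots of the symmetric 2 x 2 matrix B.
    mean = (b11 + b22) / 2.0
    radius = math.hypot((b11 - b22) / 2.0, half)
    return (x1, x2), y0, (mean - radius, mean + radius)


def classify(lambdas):
    """Conclusions (a) and (c): same signs versus opposite signs."""
    if all(lam < 0 for lam in lambdas):
        return "maximum"
    if all(lam > 0 for lam in lambdas):
        return "minimum"
    return "saddle point"


# Hypothetical fitted equation:
# y = 80 + 4*x1 + 6*x2 - 2*x1**2 - 3*x2**2 + 2*x1*x2
x0, y0, lambdas = canonical_analysis(80.0, 4.0, 6.0, -2.0, -3.0, 2.0)
print(x0, y0, lambdas, classify(lambdas))
```

A $\lambda$ that is small in absolute value relative to the others would point to a ridge, as in conclusion (b); the sketch does not attempt to set a threshold for "small".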


An independent successful use of the Box-Wilson techniques is 
given by Read [21]. 


6. CONCLUSIONS 


This paper has presented some recent ideas on the use of sequential 
methods to estimate optimal factor combinations and to explore the 
nature of the response surfaces in the vicinity of these optima. The 
reader should be cautioned that the success of these sequential pro- 
cedures depends on the following conditions: 

(i) The experiments can be sequentialized. 
(ii) The factor levels can be varied continuously. 
(iii) The experimental error is small and generally well estimated in advance. 


If some of the factor levels are discrete, it may be necessary to locate
an optimal combination of the other factors for each discrete level
and use the best of the local optima. However, it may be possible to 
find characteristics of the discrete factors which are continuous, for 
example, genetic features of varieties, chemical compositions of differ- 
ent soils, or average educational or economic features of different 
human groups. Hence, one of the objectives in future research may be 
the quantification of qualitative factors. 

The use of sequential procedures in biological and social experimen- 
tation may be limited because of the length of time required to conduct 
the experiments and the presence of large experimental errors. In many 
cases, however, response changes over time can be measured by the 
introduction of additional experimental factors. If such time changes 
can not be estimated directly, it may be necessary to use some control 
factor combinations with every new set of factor combinations. If 
controls are needed in the sequential procedure, it may be more efficient 
to use larger initial factorial experiments. Kempthorne [15] discusses 
the use of fractional factorials in incomplete blocks design. If repli- 
cations are needed to estimate experimental errors, replicate experi- 
ments also can be performed in sequential experimentation. It would 
appear that, even in the biological and social fields, the sequential 
methods discussed here should be useful in planning many long-term 
experiments. Here is a place for coordinated research at several research
centers, to avoid duplications and serious omissions in the
factor combinations used.²

Two methods of conducting multi-factor sequential experiments 
have been discussed: the use of one-factor-at-a-time and the Box- 
Wilson procedure of varying several factors at once. Another method 
might be mentioned—a procedure based on the random selection of 
factor combinations. It would be useful to have these three methods 
compared in various experimental situations. This would seem to be
a useful statistical research project. As more response surfaces are 
explored, it will be useful to know how many of them have ridge sys- 
tems. Presumably the one-factor-at-a-time approach will not be very 
efficient in the exploration of a ridge. In particular, this approach 
would not tell the experimenter that the optimal factor combination 
can be located anywhere along this ridge. In most production, several 
responses must be optimized at the same time; hence, a good descrip- 
tion is needed for each response surface, for example, costs of produc- 
tion, yield, and quality of product. 





² Fisher [10] discusses the use of sequential experimentation in a genetics experiment and Bross
[4] in medical experiments; however, neither of these articles is concerned with the estimation of an 
optimal factor combination. 





798 AMERICAN STATISTICAL ASSOCIATION JOURNAL, DECEMBER 1953


REFERENCES 


[1] Bose, R. C., “Mathematical theory of the symmetrical factorial design,” 
Sankhya, 8 (1947), 107-66. 

[2] Box, G. E. P., and Wilson, K. B., “On the experimental attainment of op- 
timum conditions,” Journal of the Royal Statistical Society, Series B, 13 
(1951), 1-45. 

[3] Box, G. E. P., “Multi-factor designs of first order,” Biometrika, 39 (1952), 
49-57. 

[4] Bross, I., “Sequential medical plans,” Biometrics, 8 (1952), 188-205. 

[5] Davies, O. L., and Hay, W. A., “The construction and use of fractional 
factorial designs in industrial research,” Biometrics, 6 (1950), 233-49. 

[6] Finney, D. J., “The fractional replication of factorial arrangements,” 
Annals of Eugenics, 12 (1945), 291-301. 

[7] Finney, D. J., “Recent developments in the design of field experiments, 
III. Fractional replication,” Journal of Agricultural Science, 36 (1946), 
184-91. 

[8] Finney, D. J., “The construction of confounded arrangements,” Empire 
Journal of Experimental Agriculture, 15 (1947), 107-12. 

[9] Fisher, R. A., The Design of Experiments, 1st edition. Oliver and Boyd, 
Edinburgh and London, 1935. 

[10] Fisher, R. A., “Sequential experimentation,” Biometrics, 8 (1952), 183-87. 

[11] Friedman, Milton, and Savage, L. J., “Planning experiments seeking max- 
ima.” Chapter 13 of Techniques of Statistical Analysis, edited by Eisenhart, 
Hastay and Wallis. McGraw-Hill Book Co., New York, 1947. 

[12] Hotelling, Harold, “Experimental determination of the maximum of a 
function,” Annals of Mathematical Statistics, 12 (1941), 20-45.

[13] Hotelling, Harold, “Some improvements in weighing and other experi- 
mental techniques,” Annals of Mathematical Statistics, 15 (1944), 297-306. 

[14] Kempthorne, O., “A simple approach to confounding and fractional replica- 
tion in factorial experiments,” Biometrika, 34 (1947), 255-72. 

[15] Kempthorne, Oscar, The Design and Analysis of Experiments, John Wiley 
and Sons, Inc. New York, 1952. 

[16] Li, Jerome C. R., “Design and statistical analysis of some confounded 
factorial experiments,” Iowa State College Agricultural Experiment Station 
Bulletin 333, (1944). 

[17] Nair, K. Raghavan, “On a method of getting confounded arrangements in 
the general symmetrical type of experiments,” Sankhya, 4 (1938), 121-38. 

[18] Nair, K. R., “Balanced confounded arrangements for the $5^n$ type of experiments,”
Sankhya, 5 (1940), 57-70.

[19] Plackett, R. L., and Burman, J. P., “The design of optimum multi-factor
experiments,” Biometrika, 33 (1946), 305-25. 

[20] Rao, C. R., “Factorial experiments derivable from combinatorial arrange- 
ments of arrays,” Journal of the Royal Statistical Society Supplement, 9 
(1947), 128-39. 

[21] Read, D. R., “The design of chemical experiments,” accepted for publica- 
tion in Biometrics, (1953). 

[22] Yates, F., “Complex experiments,” Journal of the Royal Statistical Society 
Supplement, 2 (1935), 181-247. 

[23] Yates, F., The Design and Analysis of Factorial Experiments, Imperial Bu- 
reau of Soil Science Technical Communication No. 35, (1937). 





A NOTE ON REGRESSION WHEN THERE IS 
EXTRANEOUS INFORMATION ABOUT ONE 
OF THE COEFFICIENTS 


J. DURBIN
London School of Economics 


1. INTRODUCTION


SUPPOSE we have a sample of $n$ observations corresponding to the
regression model


$$Y = \alpha + \beta_1 X_1 + \beta_2 X_2 + \epsilon,$$


where the $n$ values of $\epsilon$ are independent of each other and of the $x$'s and
have zero means and variance $\sigma^2$. In addition to this sample we are
given from outside an unbiased estimate $b_1$ of $\beta_1$ together with an unbiased
estimate $s_1^2$ of $\sigma_1^2$, its variance. What is the best way of using this
information to estimate $\beta_2$?

Situations of this kind arise in econometric work in combining cross- 
section and time-series data. For instance, in a demand study we may 
wish to estimate the price elasticity of demand from a time series of 
observations using at the same time an estimate of the income elasticity 
obtained from a budget survey. 

This problem was put to me when I was a research worker at the 
Department of Applied Economics, Cambridge, by my colleague, 
M. J. Farrell. Later developments were worked out in co-operation 
with Richard Stone, who kindly supplied the data for the numerical 
example. 


2. SIMPLE METHOD 


The simplest procedure is to accept $b_1$ as the estimate of $\beta_1$ and to
estimate $\beta_2$ by considering the regression of $Y - b_1 X_1$ on $X_2$. Denoting
by $y$, $x_1$, $x_2$ the deviations of $Y$, $X_1$, $X_2$ from their sample means, the
estimate of $\beta_2$ is


$$b_2 = \frac{\sum (y - b_1 x_1) x_2}{\sum x_2^2} = \frac{\sum x_2 y - b_1 \sum x_1 x_2}{\sum x_2^2}.$$


For given $b_1$,


$$E(b_2 \mid b_1) = \beta_2 - (b_1 - \beta_1) \frac{\sum x_1 x_2}{\sum x_2^2}.$$


Thus $b_2$ is conditionally biased. However $E(b_1) = \beta_1$, so that as $b_1$
varies $E(b_2) = \beta_2$, i.e. $b_2$ is unbiased. For fixed $x$'s the variance of $b_2$ is


$$V(b_2) = E\left[\frac{\sum x_2 (y - \beta_1 x_1 - \beta_2 x_2)}{\sum x_2^2} - (b_1 - \beta_1) \frac{\sum x_1 x_2}{\sum x_2^2}\right]^2
= \frac{\sigma^2}{\sum x_2^2} + \sigma_1^2 \frac{(\sum x_1 x_2)^2}{(\sum x_2^2)^2}
= \frac{\sigma^2}{\sum x_2^2} + \sigma_1^2 b_{12}^2, \qquad (2)$$


where $b_{12}$ is the regression coefficient of $x_1$ on $x_2$.

We can compare this with the variance that would have been obtained
if the extraneous information had not been used. In that case
the coefficients would have been estimated by least squares, the variance
of the estimate of $\beta_2$ being


$$\frac{\sigma^2}{\sum x_2^2 (1 - r^2)},$$


where $r$ is the observed correlation between $x_1$ and $x_2$. Now


$$V(b_2) = \frac{\sigma^2}{\sum x_2^2 (1 - r^2)} - \frac{r^2 \sigma^2}{\sum x_2^2 (1 - r^2)} + \frac{r^2 \sigma_1^2 \sum x_1^2}{\sum x_2^2}.$$


Thus the diminution in variance due to the use of the extraneous information is


$$\left[\frac{\sigma^2}{\sum x_1^2 (1 - r^2)} - \sigma_1^2\right] \frac{r^2 \sum x_1^2}{\sum x_2^2},$$


which is always positive if


$$\sigma_1^2 < \frac{\sigma^2}{\sum x_1^2 (1 - r^2)},$$


i.e. if the variance of the extraneous estimate of $\beta_1$ is less than the variance
of the internal least-squares estimate of $\beta_1$. When $r = 0$, there is no
improvement, as is otherwise obvious since the estimate of $\beta_2$ is unaffected
by the extraneous information about $\beta_1$.
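
The simple estimate and the variance comparison can be sketched numerically. The following is an illustration only: the data are hypothetical, $\sigma^2$ and $\sigma_1^2$ are treated as known, and the function names are assumptions.

```python
def deviations(values):
    """Deviations from the sample mean."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def cross(a, b):
    """Sum of products of two lists of deviations."""
    return sum(u * v for u, v in zip(a, b))


def simple_estimate(Y, X1, X2, b1):
    """b2 from the regression of Y - b1*X1 on X2."""
    y, x1, x2 = deviations(Y), deviations(X1), deviations(X2)
    return (cross(x2, y) - b1 * cross(x1, x2)) / cross(x2, x2)


def variance_with_extraneous(X1, X2, sigma2, sigma1_2):
    """Equation (2): sigma^2 / sum(x2^2) + sigma_1^2 * b12^2."""
    x1, x2 = deviations(X1), deviations(X2)
    b12 = cross(x1, x2) / cross(x2, x2)
    return sigma2 / cross(x2, x2) + sigma1_2 * b12 ** 2


def variance_least_squares(X1, X2, sigma2):
    """Variance of the ordinary least-squares estimate of beta_2."""
    x1, x2 = deviations(X1), deviations(X2)
    r2 = cross(x1, x2) ** 2 / (cross(x1, x1) * cross(x2, x2))
    return sigma2 / (cross(x2, x2) * (1.0 - r2))


# Hypothetical data generated without error from Y = 1 + 2*X1 + 3*X2.
X1 = [1.0, 2.0, 3.0, 4.0, 5.0]
X2 = [2.0, 1.0, 4.0, 3.0, 6.0]
Y = [1.0 + 2.0 * a + 3.0 * b for a, b in zip(X1, X2)]
b2 = simple_estimate(Y, X1, X2, b1=2.0)
v_ext = variance_with_extraneous(X1, X2, sigma2=1.0, sigma1_2=0.1)
v_ls = variance_least_squares(X1, X2, sigma2=1.0)
print(b2, v_ext, v_ls)
```

With these values $\sigma_1^2$ is below the variance of the internal least-squares estimate of $\beta_1$, so the variance of $b_2$ is reduced; raising $\sigma_1^2$ above that threshold reverses the comparison.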

To estimate $V(b_2)$ we need an estimate of $\sigma^2$. This can, of course, be
calculated from the internal least-squares analysis in the usual way, 
and this will generally give the most convenient estimator to use in 
practice. A slightly more efficient estimate which takes into account
the information contributed by the external estimate of $\beta_1$ can, however,
be obtained as follows:

$$\sum (y - b_1 x_1 - \beta_2 x_2)^2 = \sum \{(y - b_1 x_1 - b_2 x_2) + (b_2 - \beta_2) x_2\}^2
= \sum (y - b_1 x_1 - b_2 x_2)^2 + (b_2 - \beta_2)^2 \sum x_2^2.$$

Taking the expectation of the left-hand side we have 


$$E \sum (y - b_1 x_1 - \beta_2 x_2)^2 = E \sum \{(y - \beta_1 x_1 - \beta_2 x_2) - (b_1 - \beta_1) x_1\}^2
= (n - 1)\sigma^2 + \sigma_1^2 \sum x_1^2.$$


(The factor n—1 occurs instead of n since the observations are meas- 
ured from the sample means.) 


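From (2),

$$E(b_2 - \beta_2)^2 \sum x_2^2 = \sigma^2 + \frac{\sigma_1^2 (\sum x_1 x_2)^2}{\sum x_2^2}.$$

Thus,

$$(n - 2)\sigma^2 = E \sum (y - b_1 x_1 - b_2 x_2)^2 - (1 - r^2) \sigma_1^2 \sum x_1^2.$$

Consequently an unbiased estimate of $\sigma^2$ is given by

$$s^2 = \frac{1}{n - 2} \left\{ \sum (y - b_1 x_1 - b_2 x_2)^2 - (1 - r^2) s_1^2 \sum x_1^2 \right\}.$$

The first term in the bracket may be evaluated by means of the identity

$$\sum (y - b_1 x_1 - b_2 x_2)^2 = \sum (y - b_1 x_1)^2 - b_2 \sum (y - b_1 x_1) x_2
= \sum y^2 - 2 b_1 \sum x_1 y + b_1^2 \sum x_1^2 - b_2^2 \sum x_2^2.$$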


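Substituting $s^2$ and $s_1^2$ for $\sigma^2$ and $\sigma_1^2$ in (2), we get the unbiased estimate
of $V(b_2)$,

$$\frac{1}{(n - 2) \sum x_2^2} \left[ \sum (y - b_1 x_1 - b_2 x_2)^2 + s_1^2 \left\{ (n - 1) \frac{(\sum x_1 x_2)^2}{\sum x_2^2} - \sum x_1^2 \right\} \right]. \qquad (3)$$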


Unfortunately this is not distributed as a multiple of $\chi^2$ in the normal
case, and cannot therefore be used to construct an exact $t$ test of $b_2$.
For sufficiently large samples an approximate test may be obtained
by regarding $(b_2 - \beta_2)/\sqrt{V(b_2)}$ as a normal variable with zero mean
and unit variance. Alternatively, a better approximation could be constructed
by the method proposed by Welch [3].
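Similar results are found for the general regression


$$Y = \alpha + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k + \epsilon,$$


where we have an extraneous estimate $b_1$ of $\beta_1$. Let $\boldsymbol{\beta}_2$ denote the vector
$\{\beta_2, \ldots, \beta_k\}$, $\mathbf{y}$, $\mathbf{x}_1$ the vectors of deviations from sample means
of $Y$, $X_1$, and let $\mathbf{X}_2$ denote the matrix of deviations from the sample
means of $X_2, \ldots, X_k$. Then the estimate of $\boldsymbol{\beta}_2$ obtained by straightforward
substitution of $b_1$ for $\beta_1$ is


$$\mathbf{b}_2 = (\mathbf{X}_2'\mathbf{X}_2)^{-1} \mathbf{X}_2' (\mathbf{y} - b_1 \mathbf{x}_1).$$

The variance matrix of this set of estimates is

$$V(\mathbf{b}_2) = \sigma^2 (\mathbf{X}_2'\mathbf{X}_2)^{-1} + \sigma_1^2 (\mathbf{X}_2'\mathbf{X}_2)^{-1} \mathbf{X}_2' \mathbf{x}_1 \mathbf{x}_1' \mathbf{X}_2 (\mathbf{X}_2'\mathbf{X}_2)^{-1}
= \sigma^2 (\mathbf{X}_2'\mathbf{X}_2)^{-1} + \sigma_1^2 \mathbf{b}_{12} \mathbf{b}_{12}',$$


where $\mathbf{b}_{12}$ is the vector of sample regression coefficients of $x_1$ on $x_2, \ldots, x_k$.
An unbiased estimator of $\sigma^2$ is given by


$$s^2 = \frac{1}{n - k} \left[ \sum (y - b_1 x_1 - \cdots - b_k x_k)^2 - (1 - R_{12}^2) s_1^2 \sum x_1^2 \right],$$


where $R_{12}$ is the multiple correlation between $x_1$ and $x_2, \ldots, x_k$.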
A slightly more efficient estimator of the multiple correlation of $y$ on
$x_1$ and $x_2$ than that given by least squares is $R$ defined by


$$1 - R^2 = \frac{(n - k + 1) s^2}{\sum y^2}.$$


3. EFFICIENT METHOD



The above procedure, though simple and direct, does not make the
most efficient use of the available information, since no attempt is
made to improve the estimate of $\beta_1$ by means of the multiple regression
data. The question of efficiency can be explored by considering the following
general problem.

Suppose that $\mathbf{b}_1$ is a vector of unbiased estimators of a set of parameters
$\boldsymbol{\beta}_1 = \{\beta_1, \ldots, \beta_h\}$ and that $\mathbf{b}_2$ is an independent vector of unbiased
estimators of the extended set $\boldsymbol{\beta} = \{\beta_1, \ldots, \beta_k\}$ where $k \geq h$.
The variance matrices $V(\mathbf{b}_1) = V_1$ and $V(\mathbf{b}_2) = V_2$ are assumed to be
known and of ranks $h$ and $k$ respectively. We seek the best unbiased
linear estimators of $\beta_1, \ldots, \beta_k$, i.e. those which are linear in the elements
of $\mathbf{b}_1$ and $\mathbf{b}_2$ and whose variances are not greater than the variances
of any other unbiased linear estimators.

Now

$$E\begin{bmatrix} \mathbf{b}_1 \\ \mathbf{b}_2 \end{bmatrix} = \begin{bmatrix} I_h & 0 \\ I_h & 0 \\ 0 & I_{k-h} \end{bmatrix} \boldsymbol{\beta},$$


where $I_h$, $I_{k-h}$ are unit matrices of orders $h$ and $k-h$, and $0$ represents
any matrix, all of whose elements are zero. Also


$$V\begin{bmatrix} \mathbf{b}_1 \\ \mathbf{b}_2 \end{bmatrix} = V \text{ say} = \begin{bmatrix} V_1 & 0 \\ 0 & V_2 \end{bmatrix}.$$


We now apply Aitken’s [2] extension of Gauss’s least-squares theorem.
This states that if $E(\mathbf{x}) = P\boldsymbol{\alpha}$ and $V(\mathbf{x}) = V$, $P$ being a known matrix,
then provided that $V^{-1}$ and $(P'V^{-1}P)^{-1}$ exist, the vector of best
unbiased linear estimators of the elements of $\boldsymbol{\alpha}$ is $(P'V^{-1}P)^{-1}P'V^{-1}\mathbf{x}$.

In the present problem,

$$V^{-1} = \begin{bmatrix}V_1 & 0\\ 0 & V_2\end{bmatrix}^{-1} = \begin{bmatrix}W_1 & 0\\ 0 & W_2\end{bmatrix} = \begin{bmatrix}W_1 & 0 & 0\\ 0 & W_{21} & W_{22}\\ 0 & W_{23} & W_{24}\end{bmatrix},$$

say, where $W_{21}$ is the matrix formed by the first $h$ rows and columns of $W_2$; $W_{22}$, $W_{23}$, and $W_{24}$ are defined similarly. Also

$$P = \begin{bmatrix} I_h & 0\\ I_h & 0\\ 0 & I_{k-h}\end{bmatrix},$$

so that

$$P'V^{-1}P = \begin{bmatrix} W_1 + W_{21} & W_{22}\\ W_{23} & W_{24}\end{bmatrix} = W_1^* + W_2,$$

where $W_1^*$ is the $k \times k$ matrix in which $W_1$ occupies the leading position, the remaining elements being zeros. Similarly

$$P'V^{-1}\mathbf{x} = W_1^*\mathbf{b}_1^* + W_2\mathbf{b}_2,$$

where $\mathbf{b}_1^*$ is the $k \times 1$ vector in which the first $h$ elements are the elements of $\mathbf{b}_1$, the remainder being zeros.

Hence, applying Aitken's result, the best unbiased linear estimators of the elements of $\boldsymbol{\beta}$ are given by

$$\mathbf{b} = [W_1^* + W_2]^{-1}[W_1^*\mathbf{b}_1^* + W_2\mathbf{b}_2].$$

The inverse matrix $[W_1^* + W_2]^{-1}$ exists since $W_1^*$ and $W_2$ are positive semi-definite and positive definite respectively, and hence $W_1^* + W_2$ is positive definite. The normal equations therefore take the form

$$[W_1^* + W_2]\mathbf{b} = W_1^*\mathbf{b}_1^* + W_2\mathbf{b}_2. \tag{4}$$

The variance matrix is

$$V(\mathbf{b}) = [W_1^* + W_2]^{-1}. \tag{5}$$

These results are of particular interest since they illustrate the straightforward fashion in which Gauss's theorem may be generalized to deal with the estimation of parametric vectors rather than scalars.
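The combination rule in (4) and (5) can be sketched in a few lines of code. The following is a minimal illustration, not part of the original paper: the helper names `solve`, `inverse` and `combine` are hypothetical, the linear algebra is a plain Gauss-Jordan elimination adequate only for small well-conditioned matrices, and the numbers in the usage example are invented purely to make the arithmetic easy to follow.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [[float(v) for v in A[i]] + [float(b[i])] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [u - f * v for u, v in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]


def inverse(A):
    """Invert a small non-singular matrix column by column."""
    n = len(A)
    cols = [solve(A, [float(i == j) for i in range(n)]) for j in range(n)]
    return [[cols[j][i] for j in range(n)] for i in range(n)]


def combine(b1, V1, b2, V2):
    """Return (b, V(b)) from the normal equations (4) and the variance (5)."""
    h, k = len(b1), len(b2)
    W1, W2 = inverse(V1), inverse(V2)
    # W1* + W2: W1 is added to the leading h x h block of W2.
    lhs = [[W2[i][j] + (W1[i][j] if i < h and j < h else 0.0)
            for j in range(k)] for i in range(k)]
    # W1* b1* + W2 b2: W1 b1 contributes to the first h rows only.
    rhs = [sum(W2[i][j] * b2[j] for j in range(k))
           + (sum(W1[i][j] * b1[j] for j in range(h)) if i < h else 0.0)
           for i in range(k)]
    return solve(lhs, rhs), inverse(lhs)


# Hypothetical numbers: one extraneous estimate (h = 1) and two regression
# estimates (k = 2) with a diagonal variance matrix.
b, Vb = combine([1.0], [[1.0]], [3.0, 5.0], [[1.0, 0.0], [0.0, 2.0]])
print(b)
print(Vb)
```

With a diagonal $V_2$, as in this invented example, the first coefficient is simply the precision-weighted mean of the two available estimates and the second coefficient is left unchanged.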

In the application to the problem considered above, $\mathbf{b}_1$ consists of the single element $b_1$ with variance $\sigma_1^2$. Let $\mathbf{b}_2$ denote the vector of least-squares estimates of the $\beta$'s obtained from the regression observations by ignoring the extraneous information, i.e. $\mathbf{b}_2 = (X'X)^{-1}X'\mathbf{y}$, where $X$ stands for the matrix of deviations from the sample means of $X_1, \ldots, X_k$. Then

$$V(\mathbf{b}_2) = \sigma^2(X'X)^{-1},$$

so that

$$W_2 = \frac{1}{\sigma^2}X'X, \qquad W_2\mathbf{b}_2 = \frac{1}{\sigma^2}X'\mathbf{y}.$$


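Thus the normal equations (4) take the form

$$\frac{1}{\sigma_1^2}\begin{bmatrix}1 & 0 & \cdots\\ 0 & 0 & \cdots\\ \vdots & \vdots & \end{bmatrix}\hat{\boldsymbol{\beta}} + \frac{1}{\sigma^2}X'X\hat{\boldsymbol{\beta}} = \frac{1}{\sigma_1^2}\begin{bmatrix}b_1\\ 0\\ \vdots\end{bmatrix} + \frac{1}{\sigma^2}X'\mathbf{y},$$

where for convenience we write $\hat{\boldsymbol{\beta}}$ for $\mathbf{b}$, i.e.

$$\begin{aligned}
\hat\beta_1\Big(\sum x_1^2 + \lambda\Big) + \hat\beta_2\sum x_1x_2 + \cdots + \hat\beta_k\sum x_1x_k &= \sum x_1y + \lambda b_1\\
\hat\beta_1\sum x_1x_2 + \hat\beta_2\sum x_2^2 + \cdots + \hat\beta_k\sum x_2x_k &= \sum x_2y\\
&\;\;\vdots\\
\hat\beta_1\sum x_1x_k + \hat\beta_2\sum x_2x_k + \cdots + \hat\beta_k\sum x_k^2 &= \sum x_ky,
\end{aligned} \tag{6}$$

where

$$\lambda = \sigma^2/\sigma_1^2.$$

The variance matrix is, from (5),

$$V(\hat{\boldsymbol{\beta}}) = \sigma^2\begin{bmatrix}
\sum x_1^2 + \lambda & \sum x_1x_2 & \cdots & \sum x_1x_k\\
\sum x_1x_2 & \sum x_2^2 & \cdots & \sum x_2x_k\\
\vdots & \vdots & & \vdots\\
\sum x_1x_k & \sum x_2x_k & \cdots & \sum x_k^2
\end{bmatrix}^{-1}.$$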

The only difference from the ordinary least-squares expressions is the addition of $\lambda$ to $\sum x_1^2$. Thus if $\lambda$ were known it would be no more difficult to perform the efficient analysis than the least-squares analysis. The difficulty is, of course, that in practice $\lambda$ will not be known. It may, however, be estimated by beginning with a least-squares analysis of the regression data to get an estimate of $\sigma^2$. Confidence limits can be put on $\lambda$ by the ordinary variance-ratio technique. The increments $\delta b$ to be added to the least-squares estimates may then be found from the equations

$$\begin{aligned}
\delta b_1\Big(\sum x_1^2 + \lambda\Big) + \delta b_2\sum x_1x_2 + \cdots + \delta b_k\sum x_1x_k &= \lambda(b_1 - b_1')\\
\delta b_1\sum x_1x_2 + \delta b_2\sum x_2^2 + \cdots + \delta b_k\sum x_2x_k &= 0\\
&\;\;\vdots\\
\delta b_1\sum x_1x_k + \delta b_2\sum x_2x_k + \cdots + \delta b_k\sum x_k^2 &= 0,
\end{aligned} \tag{7}$$

where $b_1'$ is the least-squares estimate of $\beta_1$, as can be seen from (6). The calculation of the increments for the upper and lower confidence limits of $\lambda$ will give an idea of the sensitivity of the estimates to variations in $\lambda$.
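The sensitivity check just described can be sketched for the two-variable case, where equations (7) have a closed-form solution. This is an illustrative sketch only: the function name `increments` is hypothetical, the sums of squares and estimates are those quoted in the numerical example of Section 4, and the halved and doubled values of $\lambda$ are arbitrary stand-ins for confidence limits, which the paper does not report.

```python
def increments(s11, s12, s22, lam, b1, b1_ls):
    """Solve equations (7) for k = 2 and return (delta_b1, delta_b2)."""
    det = (s11 + lam) * s22 - s12 * s12
    rhs = lam * (b1 - b1_ls)          # right-hand side of the first equation
    return rhs * s22 / det, -rhs * s12 / det


# Sums of squares and products, and estimates, quoted in Section 4.
S11, S12, S22 = 0.0086619, 0.0070014, 0.0410816
B1, B1_LS, B2_LS = 0.575798, 0.229520, -0.613043
LAM = 0.0254897

# LAM / 2 and 2 * LAM are hypothetical limits, used only for illustration.
for lam in (LAM / 2, LAM, 2 * LAM):
    d1, d2 = increments(S11, S12, S22, lam, B1, B1_LS)
    print(f"lambda={lam:.7f}  beta1={B1_LS + d1:.6f}  beta2={B2_LS + d2:.6f}")
```

Adding the increments to the least-squares values gives the efficient estimates for each trial value of $\lambda$, so the spread of the printed rows shows directly how much the result depends on that ratio.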

It is worth noting that the efficient estimate of $\beta_1$ may be calculated directly without solving the equations (6). This is done by taking the weighted mean of $b_1$ and the least-squares value $b_1'$, the weights being the reciprocals of the respective variances. Thus

$$\hat\beta_1 = \frac{\dfrac{b_1}{\sigma_1^2} + \dfrac{b_1'}{\sigma_1'^2}}{\dfrac{1}{\sigma_1^2} + \dfrac{1}{\sigma_1'^2}}, \tag{8}$$

where $\sigma_1'^2 = V(b_1')$ and is the top left-hand element of the variance matrix $\sigma^2(X'X)^{-1}$. This value will be found to satisfy the equations (6). In practice $\sigma^2$ is unknown, but an unbiased estimate of it can be calculated in the usual way from the least-squares analysis.
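As a quick check of (8), the weighted mean can be evaluated with the figures quoted in Section 4 ($b_1 = 0.575798$ with variance 0.0558764, and $b_1' = 0.229520$ with variance 0.190701). The function name `weighted_mean` is hypothetical; the sketch assumes only that the two estimates are independent, as the text requires.

```python
def weighted_mean(b1, var_b1, b1_ls, var_b1_ls):
    """Equation (8): precision-weighted mean of two independent estimates."""
    w1, w2 = 1.0 / var_b1, 1.0 / var_b1_ls
    estimate = (w1 * b1 + w2 * b1_ls) / (w1 + w2)
    variance = 1.0 / (w1 + w2)      # variance of the weighted mean
    return estimate, variance


beta1, var_beta1 = weighted_mean(0.575798, 0.0558764, 0.229520, 0.190701)
print(round(beta1, 6), round(var_beta1, 7))
```

The result agrees, to rounding, with the value of $\hat\beta_1$ and the leading element of the variance matrix obtained in Section 4 by solving (6).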

In any particular case we must therefore decide whether to calculate $\hat\beta_1, \ldots, \hat\beta_k$ simultaneously as the solution of (6) (or, equivalently, of (7)) or alternatively whether to calculate $\hat\beta_1$ from (8), the remaining coefficients being obtained from the last $k-1$ equations of (6). If $X'X$ has been inverted during the least-squares analysis, so that $\sigma_1'^2$ or its estimate is known, it will obviously be easier to calculate $\hat\beta_1$ directly from (8). If, on the other hand, the least-squares normal equations have been solved without inverting $X'X$, it will be easier to solve (6) or (7) rather than invert $X'X$ first.


4. NUMERICAL EXAMPLE 


The foregoing results will now be illustrated by means of data on the consumption of pork in the United Kingdom 1920-38. We wish to fit a regression of the form

$$Y = \alpha + \beta_1X_1 + \beta_2X_2 + \epsilon,$$

where $Y$ is log consumption of pork per head,
$X_1$ is log income per head,
$X_2$ is log relative price of pork,
$\beta_1$ is the income elasticity, and
$\beta_2$ is the price elasticity of demand for pork.


The analysis of a set of family budget data¹ yields the value $b_1 = 0.575798$ as the estimate of $\beta_1$, with an estimated variance of 0.0558764. From the annual figures of aggregate consumption, income, etc. we obtain the values of sums of squares and products of deviations from the means,

$$\begin{aligned}
\sum y^2 &= 0.0352895 & \sum x_1y &= -0.0023041\\
\sum x_1^2 &= 0.0086619 & \sum x_2y &= -0.0235779\\
\sum x_2^2 &= 0.0410816 & \sum x_1x_2 &= 0.0070014.
\end{aligned}$$

¹ For further information about the data see the article by Stone [2]. One point that may be mentioned here is that the time-series observations were transformed by taking first differences before calculating the sums of squares and products, in order to reduce the effect of serial correlation.


A least-squares analysis of the time-series data gives the estimates of $\beta_1$ and $\beta_2$

$$b_1' = 0.229520 \quad\text{and}\quad b_2' = -0.613043,$$

with estimated variances

$$V(b_1') = 0.190701 \quad\text{and}\quad V(b_2') = 0.040208.$$

For the simple method described above we accept the value $b_1 = 0.575798$ as the estimate of $\beta_1$ and take as our estimate of $\beta_2$ the value given by (1), i.e.

$$b_2 = \frac{-0.0235779 - (0.575798)(0.0070014)}{0.0410816} = -0.672060.$$


For the estimate of variance of $b_2$ we need

$$\begin{aligned}
\sum (y - b_1x_1 - b_2x_2)^2 &= 0.0352895 + (0.575798)\{-2(-0.0023041) + (0.575798)(0.0086619)\} - (0.4516646)(0.0410816)\\
&= 0.0222595.
\end{aligned}$$


Substituting in (3) we have for the unbiased estimate of the variance of $b_2$,

$$\begin{aligned}
V(b_2) &= \frac{1}{16(0.0410816)}\left[0.0222595 + (0.0558764)\left\{\frac{17(0.0070014)^2}{0.0410816} - 0.0086619\right\}\right]\\
&= 0.0348527.
\end{aligned}$$


This may be compared with the figure of 0.0362922 obtained by substituting the least-squares estimator of $\sigma^2$ together with $s_1^2$ for $\sigma_1^2$ in (2). The apparent reduction in the variance is of course due simply to the use of a more efficient estimate of $\sigma^2$; the actual variance is unaffected.

To calculate the efficient estimates of $\beta_1$ and $\beta_2$ we need first an estimate of $\lambda$. The estimated residual variance of the time-series data is 0.00142427, whence

$$\lambda = \frac{0.00142427}{0.0558764} = 0.0254897.$$


Substituting in (6) we have the equations

$$\begin{aligned}
(0.0086619 + 0.0254897)\hat\beta_1 + 0.0070014\hat\beta_2 &= -0.0023041 + (0.0254897)(0.575798),\\
0.0070014\hat\beta_1 + 0.0410816\hat\beta_2 &= -0.0235779,
\end{aligned}$$

which yield the estimates

$$\hat\beta_1 = 0.497328 \quad\text{and}\quad \hat\beta_2 = -0.658687.$$

The estimated variance matrix is

$$0.00142427\begin{bmatrix}0.0341516 & 0.0070014\\ 0.0070014 & 0.0410816\end{bmatrix}^{-1} = \begin{bmatrix}0.0432143 & -0.0073649\\ -0.0073649 & 0.0359245\end{bmatrix}.$$

Thus the estimated variances of the estimates of $\beta_2$ given by least squares, the simple method and the efficient method respectively are

0.040208, 0.034853 and 0.035924.

These values illustrate the gains achieved by the methods developed above, in comparison with the least-squares method. As it happens, the estimated variance given by the simple method is smaller than that given by the efficient method; this is presumably due to sampling fluctuations.
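The arithmetic of the simple method in this example can be retraced with a short script. It is a sketch under the figures quoted in the text: the divisor 16 and the residual variance 0.00142427 are taken directly from the worked example rather than derived, and all variable names are illustrative.

```python
# Sums of squares and products of deviations from the means (Section 4).
Syy, S1y, S2y = 0.0352895, -0.0023041, -0.0235779
S11, S22, S12 = 0.0086619, 0.0410816, 0.0070014

b1, s1_sq = 0.575798, 0.0558764   # extraneous estimate and its variance
df = 16                           # divisor used in the worked example
s_sq_ls = 0.00142427              # least-squares residual variance (quoted)

# Simple method: substitute b1 and regress the remainder on x2.
b2 = (S2y - b1 * S12) / S22

# Residual sum of squares about the fitted values b1*x1 + b2*x2.
rss = (Syy - 2 * b1 * S1y - 2 * b2 * S2y
       + b1 * b1 * S11 + 2 * b1 * b2 * S12 + b2 * b2 * S22)

# Unbiased estimate of sigma^2, then the estimated variance of b2.
r12_sq = S12 * S12 / (S11 * S22)
s_sq = (rss - (1 - r12_sq) * s1_sq * S11) / df
b12 = S12 / S22                   # regression coefficient of x1 on x2
var_b2 = s_sq / S22 + s1_sq * b12 * b12

# Same variance formula with the least-squares estimate of sigma^2.
var_b2_ls_sigma = s_sq_ls / S22 + s1_sq * b12 * b12

print(round(b2, 6), round(rss, 7), round(var_b2, 7), round(var_b2_ls_sigma, 7))
```

The two variance figures differ only through the estimate of $\sigma^2$ that is plugged in, which is the point made in the text about the apparent reduction.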
A HOLLERITH TECHNIQUE FOR THE SOLUTION
OF NORMAL EQUATIONS


M. J. R. HEALY AND G. V. DYKE
Rothamsted Experimental Station


In the critical study of the results of large-scale sample surveys it is frequently necessary to consider data classified in several different ways, and to attempt to disentangle the effects of the various classifications, which will usually not be orthogonal to one another. One way of doing this is to fit constants by least squares, assuming the effects implied by the classifications to be additive; for a discussion of the method, see Yates [6, p. 137 et seq.]. The process of fitting involves the solution of the normal equations, a set of simultaneous equations equal in number to the total number of categories in all the classifications. If this number is at all large, the computations become very lengthy, and it is desirable to use Hollerith machinery, more especially as the main computations of the survey will often be done on Hollerith machines, and the data will already be punched on cards.

At least two methods of solving simultaneous equations with the aid of Hollerith machines have been published [3, 4]. Both employ a technique of pivotal condensation, and demand the use of a range of machines outside the scope of a small installation. In the present context the large number of equations may lead to a serious accumulation of rounding errors, and there are advantages in the alternative technique of successive approximation, as described, for example, by Stevens [5]. The present paper gives a method of mechanising this technique, using only the basic Hollerith machines, the sorter and tabulator. For producing the working pack, a reproducer is desirable though not essential.


THE METHOD OF SOLUTION 


The method of solution will be illustrated on a small scale by means 
of an example with three classifications used by Stevens [5]. The com- 
putations will be set out in some detail, and the process of mechanisa- 
tion can then be briefly explained. 

The necessary data, abstracted from the complete table given by 
Stevens, is shown in Table 1. Here we have 2-way tables showing the 
number of units in the various sub-classes, and the total number of 
units and total “yield” for each category of the three classifications. 
From this table, the normal equations (Table 2) can be written down 
immediately; the diagonal terms come from the marginal totals, the 


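TABLE 1

                     Litter                Sex
                 1     2     3     4     1     2    Totals   Yield

Diet   1         3     2     2     2     6     3       9      572
       2         3     3     2     1     6     3       9      748
       3         2     3     2     2     5     4       9      733
       4         2     3     2     2     6     3       9      815

Sex    1         7     6     6     4                  23     1982
       2         3     5     2     3                  13      886

Totals          10    11     8     7    23    13
Yield          734   819   713   602  1982   886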


TABLE 2

9d1                   + 3l1 + 2l2 + 2l3 + 2l4 + 6s1 + 3s2 =  572
      9d2             + 3l1 + 3l2 + 2l3 +  l4 + 6s1 + 3s2 =  748
            9d3       + 2l1 + 3l2 + 2l3 + 2l4 + 5s1 + 4s2 =  733
                  9d4 + 2l1 + 3l2 + 2l3 + 2l4 + 6s1 + 3s2 =  815

3d1 + 3d2 + 2d3 + 2d4 + 10l1                  + 7s1 + 3s2 =  734
2d1 + 3d2 + 3d3 + 3d4        + 11l2           + 6s1 + 5s2 =  819
2d1 + 2d2 + 2d3 + 2d4               + 8l3     + 6s1 + 2s2 =  713
2d1 +  d2 + 2d3 + 2d4                   + 7l4 + 4s1 + 3s2 =  602

6d1 + 6d2 + 5d3 + 6d4 + 7l1 + 6l2 + 6l3 + 4l4 + 23s1      = 1982
3d1 + 3d2 + 4d3 + 3d4 + 3l1 + 5l2 + 2l3 + 3l4      + 13s2 =  886





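TABLE 3

        d1    d2    d3    d4    l1    l2    l3    l4    s1    s2

d1       1                     .333  .222  .222  .222  .667  .333  =  63.56
d2             1               .333  .333  .222  .111  .667  .333  =  83.11
d3                   1         .222  .333  .222  .222  .556  .444  =  81.44
d4                         1   .222  .333  .222  .222  .667  .333  =  90.56

l1     .300  .300  .200  .200    1                     .700  .300  =  73.40
l2     .182  .273  .273  .273          1               .545  .455  =  74.45
l3     .250  .250  .250  .250                1         .750  .250  =  89.12
l4     .286  .143  .286  .286                      1   .571  .429  =  86.00

s1     .261  .261  .217  .261  .304  .261  .261  .174    1         =  86.17
s2     .231  .231  .308  .231  .231  .385  .154  .231          1   =  68.15

other coefficients from the body of the table, while the right-hand sides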
of the equations are the total yields. The basic problem is the solution 
of these equations. Notice that they are not independent; in fact what 
we determine are the differences between diets, between litters and be- 
tween sexes. 

The first step is to divide each equation by its diagonal term, obtaining the coefficients set out in Table 3. These equations with rounded-off coefficients are those which we actually solve, and this table eventually serves as a punching schedule for the working pack. As first approximations we take the straight means of each category, which appear as the right-hand sides of the equations in Table 3. However, as only differences are to be estimated, we can add or subtract any convenient quantity from each set of values, and in practice we subtract the smallest value in each set from the others. Thus, subtracting 63.56 from the d's, 73.40 from the l's and 68.15 from the s's and retaining 3 significant figures, we arrive at the first column of Table 4. Inserting these approximations into the first four equations, we obtain improved values for the d's, as follows:

    d1 ≈ 63.56 - (.333 × 0.0 + .222 × 1.0 + ... + .333 × 0.0) = 45.0494
    d2 ≈ 83.11 - (.333 × 0.0 + .333 × 1.0 + ... + .333 × 0.0) = 65.8870
    d3 ≈ 81.44 - (.222 × 0.0 + .333 × 1.0 + ... + .444 × 0.0) = 64.8164        (A)
    d4 ≈ 90.56 - (.222 × 0.0 + .333 × 1.0 + ... + .333 × 0.0) = 71.9384


We subtract 45.0494 from each of these new approximations, round off to one decimal and use them in the next set of equations to get improved values of the l's:

    l1 ≈ 73.40 - (.300 × 0.0 + .300 × 20.8 + ... + .300 × 0.0) = 45.2200
    l2 ≈ 74.45 - (.182 × 0.0 + .273 × 20.8 + ... + .455 × 0.0) = 46.2125
    l3 ≈ 89.12 - (.250 × 0.0 + .250 × 20.8 + ... + .250 × 0.0) = 58.7450        (B)
    l4 ≈ 86.00 - (.286 × 0.0 + .143 × 20.8 + ... + .429 × 0.0) = 59.3914


Subtracting 45.2200 and rounding off, we go on to the next two equations:

    s1 ≈ 86.17 - (.261 × 0.0 + .261 × 20.8 + ... + .174 × 14.2) = 63.1684        (C)
    s2 ≈ 68.15 - (.231 × 0.0 + .231 × 20.8 + ... + .231 × 14.2) = 45.2887


We thus arrive at the set of second approximations in Table 4. Repeat- 
ing the whole cycle we obtain 3rd approximations, and these are found 
to be unaltered by further cycles. 
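The cycle of successive approximation described above can be retraced in a short script. This is only a sketch of the hand computation, not of the punched-card procedure: the counts and yields are those of the normal equations in Table 2, the first approximations are the first column of Table 4, and the names `N`, `Y`, `COEF`, `MEAN` and `cycle` are illustrative.

```python
# Counts and total yields from the normal equations (Table 2).
# Order of unknowns: d1..d4, l1..l4, s1, s2.
N = [
    [9, 0, 0, 0, 3, 2, 2, 2, 6, 3],
    [0, 9, 0, 0, 3, 3, 2, 1, 6, 3],
    [0, 0, 9, 0, 2, 3, 2, 2, 5, 4],
    [0, 0, 0, 9, 2, 3, 2, 2, 6, 3],
    [3, 3, 2, 2, 10, 0, 0, 0, 7, 3],
    [2, 3, 3, 3, 0, 11, 0, 0, 6, 5],
    [2, 2, 2, 2, 0, 0, 8, 0, 6, 2],
    [2, 1, 2, 2, 0, 0, 0, 7, 4, 3],
    [6, 6, 5, 6, 7, 6, 6, 4, 23, 0],
    [3, 3, 4, 3, 3, 5, 2, 3, 0, 13],
]
Y = [572, 748, 733, 815, 734, 819, 713, 602, 1982, 886]
GROUPS = [range(0, 4), range(4, 8), range(8, 10)]   # diets, litters, sexes

# Divide each equation by its diagonal term and round off (as in Table 3).
COEF = [[round(N[i][j] / N[i][i], 3) if i != j else 0.0 for j in range(10)]
        for i in range(10)]
MEAN = [round(Y[i] / N[i][i], 2) for i in range(10)]


def cycle(x):
    """One full cycle: improve the d's, then the l's, then the s's."""
    x = list(x)
    raw = [0.0] * 10
    for group in GROUPS:
        for i in group:
            raw[i] = MEAN[i] - sum(COEF[i][j] * x[j] for j in range(10))
        base = min(raw[i] for i in group)     # only differences matter
        for i in group:
            x[i] = round(raw[i] - base, 1)    # round off to one decimal
    return x, raw


first = [0.0, 19.6, 17.9, 27.0, 0.0, 1.0, 15.7, 12.6, 18.0, 0.0]
second, raw = cycle(first)
third, _ = cycle(second)
fourth, _ = cycle(third)
print(second)
print(third)
print(fourth == third)
```

Because variables within one classification never appear in each other's equations, updating a whole group at once is equivalent to updating its members one after another, which is what makes the group-by-group tabulator runs possible.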

Solutions correct to three figures have now been obtained, and in view of the rounding off of the coefficients no further accuracy can normally be achieved by this technique without modification; three figures will in any case be sufficient in sample survey work. It is convenient to make final adjustments to make the mean of each group equal to the general mean, and these figures are given in Table 4 together with the solution obtained by Stevens.

TABLE 4

          Approximations        Approx.     Stevens'
        1st    2nd    3rd      solution     solution

d1      0.0    0.0    0.0        62.7        62.665
d2     19.6   20.8   21.0        83.7        83.678
d3     17.9   19.8   19.8        82.5        82.426
d4     27.0   26.9   26.9        89.6        89.555
l1      0.0    0.0    0.0        72.4        72.399
l2      1.0    1.0    1.0        73.4        73.391
l3     15.7   13.5   13.5        85.9        85.949
l4     12.6   14.2   14.2        86.6        86.594
s1     18.0   18.0   17.9        88.5        88.505
s2      0.0    0.0    0.0        70.6        70.662


THE HOLLERITH TECHNIQUE 


It is apparent from the cycle of iteration set out in full above that the basic operation is the computing of sums of products, Σxy say, where the x's stay fixed throughout the problem (they are in fact the coefficients in Table 3). It is natural therefore to punch these quantities on cards. The actual multiplications are done by successive addition, as on a desk calculator. Thus if 123 is to be multiplied by 456, we pass through the tabulator 4 cards punched 12300, 5 cards punched 1230 and 6 cards punched 123.
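The multiplication by successive addition can be mimicked as follows. This is a toy sketch of the arithmetic only; `card_plan` and `tabulate` are hypothetical names, and nothing here models the actual machine.

```python
def card_plan(multiplicand, multiplier):
    """List (number of cards, value punched) for each multiplier digit."""
    digits = str(multiplier)
    plan = []
    for position, digit in enumerate(digits):
        shift = len(digits) - 1 - position    # hundreds, tens, units
        plan.append((int(digit), multiplicand * 10 ** shift))
    return plan


def tabulate(plan):
    """Add up the cards one at a time, as the tabulator counter would."""
    total = 0
    for count, value in plan:
        for _ in range(count):
            total += value
    return total


plan = card_plan(123, 456)
print(plan)              # 4 cards of 12300, 5 cards of 1230, 6 cards of 123
print(tabulate(plan))    # 56088, i.e. 123 * 456
```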

The construction of the working pack will now be described in detail. One set of 27 cards as described in the previous paragraph is used for each column of Table 3, that is, for each variable, the units in the diagonal being omitted. Leaving columns 1-5 for indicative material, the basic pack will be punched as follows (- denoting a blank column):

    Col. no.       6     10    15    20    25    30
    Card no. 1a    -----|-----|-----|-----|300--|182   etc.
             2a    -----|-----|-----|-----|300--|273
             3a    -----|-----|-----|-----|200--|273
             4a    -----|-----|-----|-----|200--|273
             5a    333--|333--|222--|222--|-----|---
             etc.

Each of these a cards is copied a further eight times. Nine more cards are then punched for each variable, with the information in cols. 6-78 transferred to cols. 7-79; these will be referred to as cards 1b, 2b, .... A set of c cards is similarly punched with the information appearing in cols. 8-80.

The indicative matter in cols. 1-5 is used for sorting, controlling and checking. All a cards are punched X in col. 4, b cards are punched 1 in this column, and c cards are punched 1 in col. 5. By leading these columns to a counter and using the “29 feature”¹ a check can be made that the right multipliers are used at each stage. Column 3 is not needed in the present example, but in almost all practical cases the number of equations will be such that two or more cards will be needed for each variable; these can be distinguished by punching in this column.

To form the multipliers, the correct number of each type of card 
are picked out by hand from the pockets of the sorter. Using always 
3-figure multipliers, the 12 pockets of the machine will hold 4 variables 
at a time, so that all cards 1-4 are punched 0 in col. 1, all cards 5-8 
are punched 1, and so on. Control is made possible by over-punching 
XY, X, Y or nothing to distinguish the variables in each set of four, 
these punches being ignored by the sorter. The punching in col. 2 is 
designed to bring the cards into the sorter pockets in the proper order, 
thus 
Punch in col. 2 e- 82 4 @&2& 24 Bid -O°aaewe 


Cards la 1b le 2a 2b 2c 3a 3b 3c 4a 4b 4c 
5a 5b 5c 6a 6b etc. 


To start the solution, the first approximations are calculated (Table 
4, column 1). The cards are sorted on col. 1 and all cards 1-4 removed, 
since they are not needed in approximating to the d’s. Cards 5-8 are 
sorted on col. 2 and picked by hand to give the correct multipliers. 
Reference to the equations marked A above shows that the numbers 
of cards required are 


    5a 5b 5c   6a 6b 6c   7a 7b 7c   8a 8b 8c
     0  0  0    0  1  0    1  5  7    1  2  6


these numbers are simply the first approximations to the l’s. The re- 
maining cards are sorted on col. 2 and hand-picked in their turn. 

The cards picked out are now tabulated. Cols. 4-5, 6-10, 11-15, 
16-20, 21-25 are plugged to the counters and by controlling on col. 1, 
cols. 4-5 are totalled at the end of each variable to check that the 
hand-picking has been done correctly. The other counters total at the 
end of the run, and the printed record shows 





¹ This feature allows the punching of numbers from 0-29 in one column, the “tens” and “twenties” being overpunched X and Y respectively.

     10
    157
    126
    180   185106   172230   166236   186216


which on subtraction from the right-hand sides gives the second ap- 
proximations found at A. 

The multipliers for the d-variable cards are now known, so these are sorted and hand-picked. The l-variable cards are not needed at the next stage and are removed from the pack. The tabulator counters are replugged to cols. 4-5, 26-30, 31-35, 36-40, 41-45 and the following tabulation gives:

208 
198 
269 
180 281800 282375 303750 266086 


which leads to the approximations found at B. The rest of the solution 
continues on the lines detailed in the previous section. 

A slight modification is possible which reduces the effect of rounding 
off the coefficients of the original equations. The coefficients are 
punched in the a pack to 5 decimals, and are reduced to 4 and 3 deci- 
mals for the b and c packs. If these packs are produced mechanically 
on a reproducer, the reduction is made without rounding off (compare 
[4], pp. 162-3). 


THE METHOD IN PRACTICE 


The method has been used in the analysis of two years’ results from 
a survey of maincrop potatoes [2]. There were 30 and 28 constants re- 
spectively representing 6 classifications of which the largest contained 
11 categories. In each case, 3-figure accuracy was attained after 4 cycles 
of the iteration. When some experience had been gained, each cycle 
took about 45 minutes to complete. Preparation of the working pack 
took about 4 hours using a reproducer; this was reduced to little more 
than 1 hour when a summary punch became available, so that the 
equivalent of Table 3 could be punched at the same time as Table 1
was being produced on the tabulator. The complete solution thus took 
about 5-8 hours working time. The same iteration carried out on desk 
machines took about 4 hours for each cycle. 

Mistakes in hand-picking were rare, but it was found worth while to 
use coloured cards for the a cards representing the first figures of each 
multiplier. 

Some difficulty may be caused by highly correlated variables. If two 
such variables are present (that is, if one of the non-diagonal coefficients
is near to 1) the corrections tend to pass backwards and forwards be- 
tween them showing only slow convergence to zero. There is no par- 
ticular point in continuing the iteration for the sake of these variables 
only, as they will in fact be ill-determined and high precision in the 
solution will only be misleading. 

It is known that for normal equations the process described above always converges. Convergence may be slow, however, and Aitken has described a technique for accelerating it [1]. Its application is made awkward here by the fact that the corrections are adjusted at each stage. In the two large examples so far attempted, it has been quicker to run another cycle or two of the iteration, but in other cases Aitken's method may be useful. His iteration is not quite the one used above, but the practical differences are trivial.

We are indebted to Dr. F. Yates for the original suggestion which 
led to the method set out in this paper. 


SUMMARY 


A method is described for fitting constants to survey data by least 
squares, using a Hollerith sorter and tabulator. 
THE USE OF RUNS TO CONTROL THE MEAN IN
QUALITY CONTROL 


H. WEILER 
University of Technology, Sydney, Australia 


For quality control charts controlling the mean of a normal 
population, either small samples are taken out at frequent 
intervals or large samples at less frequent intervals. It will be 
shown that in order to detect small changes of the population 
mean, the amount of inspection is greatly reduced by the 
selection of large samples. However, if for other reasons small 
samples are desirable, a control by runs of sample means 
above or below certain control limits makes it possible to use 
small samples and yet maintain the advantage of a reduced
amount of inspection. For certain types of runs, the sample
size n = 1 turns out to be very economical, so that time saving
methods of control by gauging may be introduced without 
appreciable loss of efficiency. 


1. INTRODUCTION 


Consider a variate x representing some measure of a mass-produced
article, and suppose that the production has been brought under
control. It may then go out of control in three different ways:

(a) The mean of the population may change, which could happen, for instance,
when a tool setting gets out of position or when a tool wears out.

(b) The standard deviation of the population may change, which could happen, 
for instance, when a fixed tool becomes loose. 

(c) The population may cease to be homogeneous, that is, elements may appear 
that do not belong to the original population. This may happen, for in- 
stance, when articles are produced by several machines; if one machine 
develops a fault, articles from that machine may be out of control while 
all other articles remain unaffected. 


While for a check on faults of type (a) a control chart controlling 
the mean of the population is most suitable, a control chart for stand- 
ard deviations or ranges is used to check on faults of type (b). For 
faults of type (c) both charts are useful, but it is essential that articles 
be selected from rational subgroups [1, 2]. For instance, if the same 
type of articles are produced by each of several machines, articles may 
be selected from each machine separately in order to allow discrimina- 
tion between the various machines. 

The usual control chart controlling the mean of a population is constructed in the following way: After the mean and standard deviation of the population have been reliably estimated, samples of fixed size $n$ are selected and their arithmetic means $\bar{x} = \sum x/n$ are calculated. A chart is then constructed with control limits $m \pm B_1\sigma/\sqrt{n}$, where $m$ and $\sigma$ are the estimates of the population mean and standard deviation, and $B_1 = 3$ or 3.09. The various values $\bar{x}$ are entered in the chart in chronological order, and as soon as one such value falls outside the control limits, production is stopped to allow investigation.

In this paper, we shall investigate the following alternative control method. Instead of stopping the production when a single $\bar{x}$ value falls outside the control limits $m \pm B_1\sigma/\sqrt{n}$, we may calculate a pair of narrower limits $m \pm B_2\sigma/\sqrt{n}$ and stop production as soon as two successive $\bar{x}$ values fall above the upper or below the lower of these limits. More generally, we may calculate a pair of limits $m \pm B_\lambda\sigma/\sqrt{n}$ such that we may stop production as soon as $\lambda$ successive values fall above the upper or below the lower of these limits. In each case, $B_\lambda$ is determined such that if the population mean does not change, an average of 1000 samples is necessary to produce one run of $\lambda$ successive $\bar{x}$ values above the upper (or below the lower) control limit. Thus, in each case, a false alarm can be expected about once in every 500 samples tested. On the other hand, if the population mean does change, the amount of inspection required to detect a given change will depend on $\lambda$ and $n$.

It has been shown in a previous paper [3] that for $\lambda = 1$ the most economical sample size (that is, that value of n which would lead to the
detection of a given change of the mean after a minimum of inspection) 
is much larger than the sample sizes usually used in quality control. 
Nevertheless, since small samples lend themselves readily to the detec- 
tion of faults of type (c), the quality control engineer may be reluctant 
to abandon them in favor of larger samples. It will be shown that for 
a check on faults of type (a) the use of runs makes it possible to retain 
small samples without an appreciable loss of efficiency. 

With the exception of a paper by Olmstead [4], in which runs are 
terminated whenever an observation turns out to be one of a specified 
kind, recent publications on runs deal mainly with runs within samples 
of fixed size [5, 6]. In particular, the theory has been applied to prob- 
lems of quality control in the form of runs above the sample median 
[7, 8], and runs up and down [9, 10, 11]. The runs in this paper differ 
from those of the other publications in that they are not related to a 
fixed number of observations. They constitute a test similar to a se- 
quential test [12], where the number of observations is not predeter- 
mined but depends on the outcome of the observations themselves. 
Although little mathematical research seems to have been done in this field, the method has been used intuitively by quality control engineers [13, 14].

2. DETERMINATION OF CONTROL LIMITS FOR RUNS

Definition

Consider a sequence of trials where each trial may or may not produce an event $E$. If a particular trial produces the event $E$, we shall call it a success. A sequence of $\lambda$ consecutive successes not preceded by a success is called a run of $\lambda$ successes.

We shall make use of the following theorem [15].

Theorem

If $p$ is the probability that a single random trial results in a success, then the expected number $s$ of independent trials required to obtain a run of $\lambda$ successes is $s = (1 - p^\lambda)/p^\lambda(1 - p)$.
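The theorem can be checked numerically against a direct first-step argument. The sketch below is an illustration, not the proof given in [15]: it writes $e_i$ for the expected number of further trials when the current streak of successes has length $i$, so that $e_\lambda = 0$ and $e_i = 1 + p\,e_{i+1} + (1-p)\,e_0$, and solves these equations backwards. The function names are hypothetical.

```python
def expected_trials_formula(p, lam):
    """Closed form quoted in the theorem."""
    return (1 - p ** lam) / (p ** lam * (1 - p))


def expected_trials_recursion(p, lam):
    """Solve e_i = 1 + p*e_{i+1} + (1-p)*e_0 with e_lam = 0 for e_0.

    Each e_i is written as a_i + (1 - d_i) * e_0, starting from
    a_lam = 0, d_lam = 1 and stepping back to i = 0.
    """
    a, d = 0.0, 1.0
    for _ in range(lam):
        a = 1 + p * a      # constant part of e_i
        d = p * d          # 1 minus the coefficient of e_0 in e_i
    return a / d           # e_0 = a_0 + (1 - d_0) * e_0  =>  e_0 = a_0 / d_0


for p, lam in [(0.001, 1), (0.03213, 2), (0.10368, 3), (0.5, 5)]:
    print(p, lam, expected_trials_formula(p, lam), expected_trials_recursion(p, lam))
```

For the pairs of $p$ and $\lambda$ used in the text, both routes give an expected number of trials close to 1000.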

Using this theorem, we may solve the following problem. 


Problem 


Let $x$ be a normal variate of mean $m$ and standard deviation $\sigma$, and let $\bar{x} = \sum x/n$ be the arithmetic mean of a sample of $n$ independent $x$ values. Every $\bar{x}$ is called a trial and every $\bar{x} \ge m + B\sigma/\sqrt{n}$ a success. Determine $B$ such that on the average 1000 trials are required to obtain one run of $\lambda$ successes.

Let $p$ be the probability that a random trial $\bar{x}$ gives $\bar{x} \ge m + B\sigma/\sqrt{n}$. Solving the above problem for $\lambda = 1$, we have $s = 1/p = 1000$, or $p = 0.001$. Since $\bar{x}$ is normally distributed with mean $m$ and standard deviation $\sigma/\sqrt{n}$, we obtain $B = 3.09$ from a set of normal probability tables. Thus, if the $\bar{x}$ values are entered in a control chart with control limits $m \pm 3.09\sigma/\sqrt{n}$, and if the population mean and standard deviation remain unchanged, we can expect that an average of 1000 trials will be required to obtain one trial above the upper control limit. Similarly, an average of 1000 trials will be required to obtain one trial below the lower control limit, so that an average of 500 trials can be expected to pass before a “false alarm” or type I error [5] occurs.

In a similar manner, we may solve the problem for $\lambda = 2$. This gives $s = (1 + p)/p^2 = y^2 + y = 1000$, where $y = 1/p$. Solving this equation, we obtain $y = 31.127$ and $p = 0.03213$. The normal probability tables give $B = 1.85$. Thus, if $\bar{x}$ is entered in a chart with control limits $m \pm 1.85\sigma/\sqrt{n}$, and if two successive $\bar{x}$ values above the upper or below the lower limit are regarded as significant, we can again expect that 500 trials will pass before a false alarm occurs.

For $\lambda = 3$, the equation reduces to $s = y + y^2 + y^3 = 1000$, which may be solved by any numerical method. Using Newton's method, we find easily $y = 9.645$ and $p = 0.10368$, and we deduce $B = 1.26$.
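The same calculation can be carried out for any $\lambda$ with a few lines of code. The sketch below uses bisection rather than Newton's method, and the inverse normal distribution from Python's standard library in place of printed tables; the function names `solve_y` and `control_factor` are hypothetical.

```python
from statistics import NormalDist


def solve_y(lam, target=1000.0):
    """Solve y + y^2 + ... + y^lam = target for y > 1 by bisection."""
    def excess(y):
        return sum(y ** j for j in range(1, lam + 1)) - target

    lo, hi = 1.0, target          # excess(lo) < 0 <= excess(hi) when lam < target
    for _ in range(200):
        mid = (lo + hi) / 2
        if excess(mid) > 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


def control_factor(lam, target=1000.0):
    """Return (y, p, B) for a run of lam successes."""
    y = solve_y(lam, target)
    p = 1.0 / y
    B = NormalDist().inv_cdf(1.0 - p)   # upper tail probability p
    return y, p, B


for lam in (1, 2, 3):
    y, p, B = control_factor(lam)
    print(lam, round(y, 3), round(p, 5), round(B, 2))
```

Bisection is chosen here only because it needs no derivative and no starting guess; Newton's method, as used in the text, would converge in fewer steps.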
In this way, we calculate $B$ for $\lambda = 1, 2, 3, \ldots, 9$, and obtain the following values.

TABLE I
CONTROL LIMIT FACTORS FOR λ = 1, 2, ...

λ         2        3        4        5        6        7

y      31.127    9.645    5.341    3.742    2.953    2.494
p       .0321    .1037    .1873    .2672    .3386    .4010
B       1.85     1.26     0.89     0.62     0.42     0.25

In each case, if we regard a run of $\lambda$ values above the upper or below the lower control limit as significant, we can expect an average of 500 trials to pass before a type I error is committed.


3. THE AVERAGE AMOUNT OF INSPECTION FOR A GIVEN CHANGE 
OF THE POPULATION MEAN 


Let $x$ be a normal variate and suppose that the control limits $m \pm B\sigma/\sqrt{n}$ are adopted for the arithmetic mean $\bar{x} = \sum x/n$. If the population mean changes from $\mu = m$ to $\mu = m + k\sigma$ $(k > 0)$ while the standard deviation $\sigma$ remains constant, the probability that $\bar{x}$ exceeds the upper control limit is (see also [3]):

$$\begin{aligned}
P &= \Pr\{\bar{x} > m + B\sigma/\sqrt{n} \mid \mu = m + k\sigma\}\\
&= \Pr\left\{\frac{\bar{x} - m - k\sigma}{\sigma/\sqrt{n}} > B - k\sqrt{n} \;\Big|\; \mu = m + k\sigma\right\}\\
&= \Pr\{z > B - k\sqrt{n}\},
\end{aligned} \tag{1}$$

where $z$ is the standardized normal variate (mean zero, standard deviation one).

If we regard a run of $\lambda$ values above the upper control limit as significant, we shall on the average require $S = (1 - P^\lambda)/P^\lambda(1 - P)$ samples to detect a change of the mean from $\mu = m$ to $\mu = m + k\sigma$. The corresponding number of articles to be tested is

$$A(n) = \frac{n(1 - P^\lambda)}{P^\lambda(1 - P)}, \tag{2}$$

where

$$P = \frac{1}{\sqrt{2\pi}}\int_{B - k\sqrt{n}}^{\infty} e^{-z^2/2}\,dz. \tag{3}$$

The value of $n$ for which $A(n)$ is a minimum may be found by solving the equation $dA/dn = 0$. This has been done in [3] for $\lambda = 1$ and can also be done for $\lambda = 2, 3, \ldots$, but a direct calculation of $A(n)$ for various values of $n$ is less tedious and more instructive.
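Such a direct calculation of $A(n)$ is easy to sketch. The code below evaluates the probability $P$ and the average amount of inspection $A(n)$ using the normal distribution from Python's standard library; the function names are hypothetical, and the values $k = 0.4$ and $n = 5$ in the usage lines are chosen only as an illustration.

```python
from statistics import NormalDist


def prob_exceed(B, k, n):
    """P = Pr{z > B - k*sqrt(n)} for a shift of k standard deviations."""
    return 1.0 - NormalDist().cdf(B - k * n ** 0.5)


def average_inspection(n, k, lam, B):
    """A(n) = n (1 - P^lam) / (P^lam (1 - P))."""
    P = prob_exceed(B, k, n)
    return n * (1 - P ** lam) / (P ** lam * (1 - P))


# Conventional chart (lam = 1, B = 3.09) against runs of two (lam = 2, B = 1.85).
a1 = average_inspection(5, 0.4, 1, 3.09)
a2 = average_inspection(5, 0.4, 2, 1.85)
print(round(a1), round(a2))

# Tabulate A(n) over several sample sizes for a fixed shift.
for n in (1, 5, 10, 20, 40):
    print(n, round(average_inspection(n, 0.4, 1, 3.09)))
```

Looping over $n$ in this way reproduces, point by point, the kind of curve that would otherwise be read from a chart of $A(n)$ against $k$.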


Fig. 1. Average Amount of Inspection for λ = 1; B = 3.09; n = 1, 5, 10, 20, 40.


4. GRAPHICAL REPRESENTATION AND DISCUSSION 


Equations (2) and (3) show that when $n$, $\lambda$, and $B$ are given, the average amount of inspection $A(n)$ is a function of $k$, which can be readily calculated with the help of a set of normal tables. The variation of $A(n)$ as a function of $k$ is shown in Figures 1, 2, 3, 4, for various values of $n$ and $\lambda$. Since $A(n)$ increases rapidly with decreasing $k$, the use of semi-logarithmic paper was found to be convenient.

It may be seen from Figure 1 that with the conventional control chart ($\lambda = 1$), small samples usually require a much greater amount of inspection than large samples. In particular, the sample size $n = 5$ is economical only when the population mean changes by more than (say) one standard deviation. The sample size $n = 1$ is particularly uneconomical unless $k$ is very large.

Figure 2 gives the average amount of inspection required when two
successive values above the upper (or below the lower) control limit
are regarded as significant. It shows that here the amount of
inspection by means of small samples is greatly reduced. In particular,
for k = 0.4 and sample size n = 5, the average amount of inspection is
380 for the conventional chart and only 210 for the chart with
$\lambda = 2$. The sample size n = 1, although still uneconomical, is
more economical than with the conventional chart.

FIG. 2. Average Amount of Inspection for $\lambda = 2$; B = 1.85; n = 1, 5, 10, 20.

Figure 3 shows that a chart with $\lambda = 3$ represents a further
improvement for small samples, while large samples become uneconomical.

Figure 4 shows that for $\lambda = 6$ the sample size n = 1 is very
economical. This is an important result, because the high efficiency of
a control by runs of individual values makes it possible to use gauges
where otherwise costly measurements are required. The loss of
efficiency that testing by gauges usually entails is here avoided.

FIG. 3. Average Amount of Inspection for $\lambda = 3$; B = 1.26; n = 1, 5, 10, 20.


A similar graph for $\lambda = 9$, B = 0 would show that the efficiency
of sample size n = 1 is about the same as for a chart with
$\lambda = 6$. The advantage of taking $\lambda = 9$ rather than
$\lambda = 6$ would be that B is equal to zero so that no control limits
need to be calculated.


5. THE CHOICE OF THE MOST SUITABLE CONTROL CHART 


FIG. 4. Average Amount of Inspection for $\lambda = 6$; B = 0.42; n = 1, 5, 10.

Since for any given value of B, the probability P defined by equation
(3) is a function of $k\sqrt{n}$ alone, the expression

$$k^2 A(n) = \frac{(k\sqrt{n})^2 (1 - P^\lambda)}{P^\lambda (1 - P)} \tag{4}$$

remains constant as long as $k\sqrt{n}$ and $\lambda$ are kept fixed.
For any given value of $\lambda$, this expression is thus a function of
the one variable $k\sqrt{n}$. It is easy to calculate the values of the
function for any values of $k\sqrt{n}$ and to plot the corresponding
curve. This has been done in Figure 5 for $\lambda$ = 1, 2, 3, 6, 9. The
curves show clearly that the conventional chart, based on
$\lambda = 1$, is economical only when $k\sqrt{n}$ is greater than (say)
2.5.
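
The reduction to the single variable $k\sqrt{n}$ can be checked numerically. The sketch below is an illustration only: the names `scaled_inspection`, `CHARTS` and `best_chart` are hypothetical, and the pairs of $\lambda$ and B are those quoted in the figure captions and in the text for $\lambda = 9$.

```python
import math

def upper_tail(x):
    """Pr{z > x} for a standard normal variate z."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def scaled_inspection(u, B, lam):
    """k^2 A(n) of equation (4), written as a function of u = k * sqrt(n)."""
    P = upper_tail(B - u)
    return u**2 * (1.0 - P**lam) / (P**lam * (1.0 - P))

# Run length lam -> control-limit factor B, as quoted for the five charts.
CHARTS = {1: 3.09, 2: 1.85, 3: 1.26, 6: 0.42, 9: 0.0}

def best_chart(u):
    """Run length with the smallest k^2 A(n) at a given u = k * sqrt(n)."""
    return min(CHARTS, key=lambda lam: scaled_inspection(u, CHARTS[lam], lam))

if __name__ == "__main__":
    for u in (0.5, 1.0, 1.5, 2.0, 2.5, 3.0):
        print(u, best_chart(u))
```

Because `scaled_inspection` takes only u, any two combinations of k and n with the same product $k\sqrt{n}$ give the same value, which is what allows one curve per $\lambda$ to summarize all sample sizes.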

This means that the conventional chart is most efficient in a range
that is usually of little interest. If, for instance, the sample size
n = 4 is used, the conventional chart is efficient only when the
population mean changes by more than 1.3 standard deviations. A chart
with $\lambda = 2$, on the other hand, would then be very efficient for
changes of between 0.8 and 1.5 standard deviations and is superior to a
chart with $\lambda = 1$ for any change up to 1.4 standard deviations.
The saving of inspection may be anything up to 40%.

The saving is even greater (up to 50%) when $\lambda = 3$ is used, but
the range for which such a chart is most efficient is somewhat reduced.
When $\lambda = 6$ is used, the saving may be anything up to 60%.
However, the range of high efficiency is further reduced and the chart
becomes rather inefficient when $k\sqrt{n} > 2$.

The case $\lambda = 9$ is of special interest, because B is equal to
zero. This means that no control limits need to be drawn. A chart then
becomes unnecessary, and gauging methods may be adopted when the sample
size n = 1 is adopted. Moreover, any of the above charts may be
combined with the observation of individual articles. Production should
be stopped to allow investigation as soon as $\lambda$ successive
$\bar{x}$ values fall above the upper or below the lower limit, or when
9 successive single values fall above or below the population mean.


6. CONCLUSION 


It has been demonstrated that the sequential use of runs for control 
charts controlling the mean leads to great saving of inspection, and 
that it will, in many cases, be of advantage to introduce it instead of 
the conventional chart. The conventional chart is to be preferred only 
when large samples are not a disadvantage or when sequential methods 
are undesirable.


TRUNCATED POISSON DISTRIBUTIONS


PAUL R. RIDER
Wright-Patterson Air Force Base and Washington University 


This paper gives a method of estimating the parameter of 
a Poisson distribution which has been truncated at the lower 
end. Application is made to a number of actual examples. 


INTRODUCTION 


MANY studies have been made of truncated distributions. (See [2]
and the references contained therein.) Of the continuous type,
the normal distribution and the Pearson system of distributions have
been rather thoroughly investigated. Of discrete distributions, the
binomial has been studied by Finney [3].

Yule [6] has considered an interesting type of distribution which he 
met in studying vocabulary. This is the number of words occurring 
once, the number occurring twice, and so on, in a specified work of a 
certain author. The distribution is somewhat similar to a truncated 
discrete distribution, in that there is no frequency corresponding to 
the number of words occurring zero times. Obviously there can be no 
frequency corresponding to the zero class unless it can be assumed that 
the total number of words in the author’s vocabulary is known. The 
frequency of the zero class would then be those words in his vocabulary 
which were not used in the particular work under consideration. 

Other examples of truncated discrete distributions can easily be 
thought of. Consider, for example, the distribution of number of traf- 
fic violations. There will be certain persons who have received 1 ticket, 
some who have received 2 tickets, some 3, and so on. There will be no 
record of those who have received no tickets. 

The present paper considers another discrete distribution, the Pois- 
son. 

As is well known, the Poisson probability function is the function

$$p_x = e^{-\lambda} \lambda^x / x! \tag{1}$$

This gives the limit, as the number of trials approaches infinity but
the number of expected occurrences $\lambda$ remains constant, of the
probability that an event will occur exactly x times, x ranging over
the non-negative integers.

The function contains the single parameter $\lambda$, to which,
incidentally, each and every semi-invariant of the Poisson distribution
is equal. Tippett [5] and Bliss [1] have considered the question of
estimating this parameter when the frequencies of those classes
corresponding to values of x above a certain specified value have been
pooled. Fisher and Yates [4], p. 1, have shown that for an even number
of degrees of freedom, the probability of exceeding a given value of
$\chi^2$ is reducible to a partial sum of a Poisson series, i.e., a
Poisson series with the upper end truncated. The present paper gives
methods of estimating $\lambda$ when some of the data in a sample are
missing, particularly when the lower end is truncated.


ESTIMATING THE PARAMETER FROM TWO CLASS FREQUENCIES 


If a sample is truly Poisson in character, the value of $\lambda$ can
be estimated even when only two different class frequencies are known.
Let us designate by $f_x$ the frequency with which the value x occurs in
the sample. Then the expected value of $f_x$ is $Np_x$, where N is the
number in the sample. If we use the observed frequencies of two
different classes as estimates of their expected values, we are led to
the equation

$$\frac{f_l}{f_m} = \frac{m!\,\lambda^l e^{-\lambda}}{l!\,\lambda^m e^{-\lambda}}, \tag{2}$$

which is easily solved for $\lambda$.
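
As an illustration, equation (2) can be solved in closed form, since the exponential factors cancel and $\lambda^{l-m} = l!\,f_l/(m!\,f_m)$. The function name below is hypothetical, and the class indices l and m are assumed to be distinct with non-zero frequencies.

```python
import math

def lambda_from_two_classes(l, f_l, m, f_m):
    """Solve equation (2) for lambda, given the observed frequencies
    f_l and f_m of two different classes l and m."""
    if l == m:
        raise ValueError("the two classes must be different")
    ratio = (math.factorial(l) * f_l) / (math.factorial(m) * f_m)
    return ratio ** (1.0 / (l - m))

if __name__ == "__main__":
    # Classes 1 and 2 of the sample drawn with lambda = 2 in Table 2.
    print(lambda_from_two_classes(2, 25, 1, 31))
```

With adjacent classes the estimate reduces to a simple ratio, so it is sensitive to sampling fluctuation in either frequency.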


ESTIMATING THE PARAMETER FROM A TRUNCATED SAMPLE 


We wish now to consider the case in which one or more classes at the
lower end of the sample are missing. We shall use the following
notation:

$$T_0 = \sum_{x=k}^{\infty} f_x, \qquad T_1 = \sum_{x=k}^{\infty} x f_x, \qquad T_2 = \sum_{x=k}^{\infty} x^2 f_x, \tag{3}$$

where k is the number of missing classes. Further, let

$$
\begin{aligned}
T_0' &= N \sum_{x=0}^{k-1} p_x + T_0, \\
T_1' &= N \sum_{x=0}^{k-1} x p_x + T_1, \\
T_2' &= N \sum_{x=0}^{k-1} x^2 p_x + T_2.
\end{aligned} \tag{4}
$$

Then $T_1'/T_0'$ is an estimate of the mean $\lambda$, and similarly,
$T_2'/T_0'$ is an estimate of the second moment of the distribution
about the origin, viz., $\lambda + \lambda^2$.

We therefore set

$$T_1' = \lambda T_0', \qquad T_2' = (\lambda + \lambda^2) T_0'. \tag{5}$$

Substituting from (4) and (1), we are led, after some reduction, to the
following equations:

$$T_1 - \lambda T_0 = \frac{N e^{-\lambda} \lambda^k}{(k-1)!}, \tag{6}$$

$$T_2 - (\lambda + \lambda^2) T_0 = \frac{N e^{-\lambda} \lambda^k}{(k-1)!}\,(k + \lambda). \tag{7}$$

Solving these simultaneous equations for $\lambda$, we get

$$\lambda = \frac{T_2 - k T_1}{T_1 - (k-1) T_0}. \tag{8}$$

When $\lambda$ has been estimated from (8), all missing $f_x$ can be
estimated, as can the total frequency.
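
A minimal sketch of the estimator (8) follows. The function name and the dictionary layout of the data are hypothetical. The estimate of the total frequency is obtained here by solving equation (6) for N, which is one possible reading of the remark above and is an assumption rather than a procedure stated in the text.

```python
import math

def truncated_poisson_estimate(freqs, k):
    """Estimate lambda from equation (8).

    freqs maps each value x to its observed frequency f_x; classes
    0, 1, ..., k-1 are treated as missing and ignored (k >= 1)."""
    t0 = sum(f for x, f in freqs.items() if x >= k)
    t1 = sum(x * f for x, f in freqs.items() if x >= k)
    t2 = sum(x * x * f for x, f in freqs.items() if x >= k)
    lam = (t2 - k * t1) / (t1 - (k - 1) * t0)
    # Assumption: total frequency N from equation (6) solved for N.
    n_total = (t1 - lam * t0) * math.factorial(k - 1) * math.exp(lam) / lam**k
    return lam, n_total

# Sample drawn with lambda = 0.5, as recorded in Table 2.
sample = {0: 54, 1: 32, 2: 12, 3: 2}

if __name__ == "__main__":
    print(truncated_poisson_estimate(sample, 1))
    print(truncated_poisson_estimate(sample, 2))
```

For this sample the sums are $T_1 = 62$, $T_2 = 98$ with k = 1, giving 36/62, and $T_0 = 14$, $T_1 = 30$, $T_2 = 66$ with k = 2, giving 6/16, in agreement with the first two estimates recorded for that sample in Table 2.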


EMPIRICAL SAMPLING 


As a test of how good an estimate of $\lambda$ is provided by (8),
samples of size 100 were drawn, by using random numbers, from
populations of 10,000, conforming as closely as possible to Poisson
distributions. The following values of $\lambda$ were used: 0.5, 1, 2,
3, 4, 5. These samples are shown in Table 2. The first column gives the
values of x. The second column, headed $\lambda = 0.5$, gives the
frequencies, for the respective values of x, in the sample drawn from
the Poisson population in which the parameter $\lambda$ has the value
0.5; the column headed $\lambda = 1$ gives the frequencies in the sample
drawn from the population in which the parameter has the value 1, and
so on.

We shall denote by $\lambda'$ the estimate of $\lambda$ obtained from
(8) with k = 1, and by $\lambda''$ the estimate of $\lambda$ obtained
from (8) with k = 2. Values of $\lambda'$ and $\lambda''$ are recorded
in Table 2.


COMPARISON WITH MAXIMUM LIKELIHOOD ESTIMATES 


It can be shown that the maximum likelihood estimate of $\lambda$ is
given by the solution of the equation

$$\nu_1^{(k)} = \frac{\lambda \left(1 - \sum_{x=0}^{k-2} p_x\right)}{1 - \sum_{x=0}^{k-1} p_x}, \tag{9}$$

where $\nu_1^{(k)}$ is the first moment of the truncated sample. In
particular, we have

$$\nu_1' = \frac{\lambda}{1 - e^{-\lambda}}, \qquad \nu_1'' = \frac{\lambda (1 - e^{-\lambda})}{1 - e^{-\lambda} - \lambda e^{-\lambda}}. \tag{10}$$
TABLE 1

 $\lambda$ | $\nu_1'$ | $\nu_1''$ || $\lambda$ | $\nu_1'$ | $\nu_1''$ || $\lambda$ | $\nu_1'$ | $\nu_1''$
 0.1 | 1.051 | 2.034 || 0.9 | 1.517 | 2.347 || 1.7 | 2.080 | 2.742
 0.2 | 1.103 | 2.069 || 1.0 | 1.582 | 2.392 || 1.8 | 2.156 | 2.797
 0.3 | 1.157 | 2.105 || 1.1 | 1.649 | 2.438 || 1.9 | 2.234 | 2.854
 0.4 | 1.213 | 2.142 || 1.2 | 1.717 | 2.486 || 2.0 | 2.313 | 2.911
 0.5 | 1.271 | 2.181 || 1.3 | 1.787 | 2.534 || 2.5 | 2.724 | 3.220
 0.6 | 1.330 | 2.221 || 1.4 | 1.858 | 2.584 || 3.0 | 3.157 | 3.560
 0.7 | 1.391 | 2.262 || 1.5 | 1.931 | 2.635 || 4.0 | 4.075 | 4.323
 0.8 | 1.453 | 2.304 || 1.6 | 2.005 | 2.688 || 5.0 | 5.034 | 5.176
TABLE 2
SAMPLES FROM POISSON DISTRIBUTIONS

 x                 | $\lambda$=0.5 | $\lambda$=1 | $\lambda$=2 | $\lambda$=3 | $\lambda$=4 | $\lambda$=5
 0                 | 54   | 47   | 20   | 6    | 4    | 0
 1                 | 32   | 26   | 31   | 7    | 9    | 7
 2                 | 12   | 14   | 25   | 32   | 21   | 5
 3                 | 2    | 9    | 10   | 14   | 20   | 10
 4                 |      | 4    | 10   | 19   | 17   | 28
 5                 |      |      | 3    | 15   | 10   | 11
 6                 |      |      | 1    | 4    | 8    | 16
 7                 |      |      |      | 2    | 8    | 14
 8                 |      |      |      | 1    | 2    | 6
 9                 |      |      |      |      | 1    | 2
 10                |      |      |      |      |      | 1
 $\lambda'$        | 0.58 | 1.34 | 1.86 | 3.02 | 3.70 | 4.68
 $\lambda''$       | 0.38 | 1.34 | 1.95 | 2.93 | 3.73 | 4.65
 $\nu_1'$          | 1.35 | 1.83 | 2.15 | 3.30 | 3.73 | 4.84
 $\nu_1''$         | 2.14 | 2.63 | 2.88 | 3.48 | 4.01 | 5.13
 $\hat{\lambda}'$  | 0.63 | 1.36 | 1.79 | 2.71 | 3.62 | 4.80
 $\hat{\lambda}''$ | 0.40 | 1.49 | 1.94 | 2.89 | 3.59 | 4.94

To assist in solving these equations, values were assigned to
$\lambda$, and the corresponding values of $\nu_1'$ and $\nu_1''$ were
calculated. Results are shown in Table 1. From this table, for a given
value of $\nu_1'$ or $\nu_1''$, the maximum likelihood estimate,
$\hat{\lambda}'$ or $\hat{\lambda}''$, of the parameter $\lambda$ can be
obtained by interpolation. The values for the samples obtained in this
study are exhibited in Table 2.
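
Where a table such as Table 1 is not at hand, the same equation can be solved numerically. The sketch below is an illustration, not the author's procedure: it replaces interpolation in Table 1 by bisection, the function names are hypothetical, and it assumes that the truncated sample mean exceeds k so that a root exists.

```python
import math

def poisson_cdf_below(lam, k):
    """Sum of p_x for x = 0, ..., k-1 (zero when k <= 0)."""
    return sum(math.exp(-lam) * lam**x / math.factorial(x) for x in range(k))

def truncated_mean(lam, k):
    """Mean of a Poisson variate given x >= k; for k = 1 and k = 2 this
    reduces to the two expressions in (10)."""
    return lam * (1.0 - poisson_cdf_below(lam, k - 1)) / (1.0 - poisson_cdf_below(lam, k))

def ml_estimate(nu, k, tol=1e-10):
    """Solve truncated_mean(lam, k) = nu for lam by bisection."""
    lo, hi = 1e-4, nu   # the truncated mean is never below lam, so lam <= nu
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if truncated_mean(mid, k) < nu:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    # Truncated means of the lambda = 0.5 sample of Table 2.
    print(ml_estimate(62 / 46, 1), ml_estimate(30 / 14, 2))
```

Bisection is used because the truncated mean increases steadily with $\lambda$; any other root-finding method would serve equally well.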


CONCLUSION 


As judged by the limited number of samples in this study, the
estimates of $\lambda$ provided by the suggested method seem in most
cases to be somewhat better than those provided by the method of
maximum likelihood. This is particularly true when only the lowest
class is missing. Moreover, the method is quite simple and direct,
while the method of maximum likelihood requires the solution of
equation (9) either by trial and error or by the use of tables similar
to Table 1.


ADDENDUM


Attention should be called to a paper by F. N. David and N. L.
Johnson, “The Truncated Poisson,” Biometrics, 8 (1952), 275-85,
which appeared after my paper was submitted for publication. The
authors consider the special case k = 1 of the estimator which I have
proposed. They show that it has an efficiency less than 1. The
efficiency has a minimum value of about 70%, which occurs in the range
$\lambda = 2.5$ to $\lambda = 3.0$, and approaches 100% with increasing
$\lambda$.

















PERCENTAGE POINTS OF THE INCOMPLETE 
BETA FUNCTION 


ROBERT E. CLARK
The Pennsylvania State College


THE table presented here gives to four significant figures the values
of $p = P(N, X, \alpha)$ defined by

$$\alpha = \sum_{r=X}^{N} \binom{N}{r} p^r (1 - p)^{N-r} = I_p(X, N - X + 1),$$

for N = 10(1)50, X = 1(1)N, and $\alpha$ = .005, .010, .025, and .050,
where $I_p(X, Y)$ is Karl Pearson’s incomplete beta function ratio.¹
Values of p for which $I_p(X, Y)$ = .005, .010, .025, .05, .10, .25, .50
have been given by Thompson² in terms of the arguments $\nu_1 = 2Y$ and
$\nu_2 = 2X$. The entries in the present table were obtained by inverse
linear interpolation of the logarithms of the accumulated frequencies
found in Pearson’s Tables of the Incomplete Beta Function and by
interpolation with Lagrangian coefficients of Thompson’s percentage
points of the incomplete beta function. For values of p > .2000 these
two methods of interpolation gave results which agreed within two units
in the fourth significant figure, in spite of the fact that for
$\nu_2 > 30$ in Thompson’s tables double interpolation was employed.
Thompson’s tables for p < .2000 were worked twice to insure accuracy,
and then were accepted as accurate. For values of p > .2000 the data
were smoothed by taking fourth differences, staying within the limits
set by these two methods of interpolation. The data are therefore felt
to be accurate within ±1.5 in the last significant figure.
Since binomial sums and the incomplete Beta function appear
frequently in statistics, the table may be used in a number of problems:
1. Confidence limits for binomial variates may be obtained directly from the
table.
2. The table gives some percentage points for the incomplete Beta function
which are not given by Thompson.²
3. The values in the table are the lower percentage points of all order
statistics in samples of size from 10 to 50 inclusive. From these values it
is possible to obtain the corresponding percentage points of any continuous
distribution by the method given by Curtiss.³





1 Tables of the Incomplete Beta Function, edited by Karl Pearson, Biometrika Office, University
College, London, W.C. 1.

2 Catherine M. Thompson: “Percentage points of the incomplete beta function,” Biometrika, 32
(1941), 168-81.

3 J. H. Curtiss: “Convergent sequences of probability distributions,” American Mathematical
Monthly, 50 (1943), 103-5.

4. The .05 column of the table is an extension of a part of the table given by
Grubbs.⁴
5. Percentage points of the F distribution for $\alpha$ = .005, .010, .025, .050;
$n_1$ = 2(2)100, $n_2$ = 2(2)100 for $n_1 + n_2 - 2 \le 100$ may be obtained from
the present table by making the transformation⁵

$$F = \frac{n_2 (1 - p)}{n_1 p},$$

where $p = P[\tfrac{1}{2}(n_1 + n_2 - 2), \tfrac{1}{2} n_2, \alpha]$. For example, $F_{.05}$ for
$n_1 = 10$, $n_2 = 20$ is 2.35 since p = P(14, 10, .05) = .46.
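
Individual entries of the table, and the F transformation of item 5, can be checked with a short computation. This is a sketch only: it finds p by bisection on the binomial sum rather than by the interpolation described above, and the function names are hypothetical.

```python
import math

def binomial_upper_tail(N, X, p):
    """Pr{at least X successes in N trials}, i.e. I_p(X, N - X + 1)."""
    return sum(math.comb(N, r) * p**r * (1.0 - p)**(N - r) for r in range(X, N + 1))

def percentage_point(N, X, alpha, tol=1e-12):
    """p = P(N, X, alpha): the p at which the upper tail equals alpha."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if binomial_upper_tail(N, X, mid) < alpha:   # tail increases with p
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def f_from_table(n1, n2, alpha):
    """Upper percentage point of F from the transformation in item 5
    (n1 and n2 are assumed even)."""
    p = percentage_point((n1 + n2 - 2) // 2, n2 // 2, alpha)
    return n2 * (1.0 - p) / (n1 * p)

if __name__ == "__main__":
    print(round(1e4 * percentage_point(10, 1, 0.005), 3))
    print(round(f_from_table(10, 20, 0.05), 2))
```

The table entries are p multiplied by 10,000, so the first printed value corresponds to the entry for N = 10, X = 1, $\alpha$ = .005, and the second to the worked example in item 5.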


These are only a few of the applications which may be made of the 
table which is presented here. The reader will probably know of others 


TABLE 1 


PERCENTAGE POINTS OF THE INCOMPLETE BETA FUNCTION 
(times 10,000) 











N X .005 .010 .025 .050 N X .005 .010 .025 .050
10 1 5.011 10.05 25.29 51.16 7 2085 2349 2767 3152
2 108.5 155.4 252.1 367.7 8 2725 3024 3489 3909 
3 370.1 475.1 667.4 872.6 9 3448 3778 4281 4727 
4 767.7 932.1 1216 1500 10 4270 4627 5159 5619 
5 1283 1504 1871 2224 11 5230 5605 6152 6613 
6 1909 2183 2624 3035 12 6431 6813 7354 7791 
7 2649 2971 3476 3934 
8 3518 3883 4439 4931 13 1 3.855 7.728 19.46 39.38 
9 4557 4956 5550 6058 2 82.52 118.2 192.1 280.5 
10 5887 6310 6915 7411 3 278.3 357.8 503.8 660.5 
4 570.8 694.6 909.2 1127 
11 1 4.556 9.133 22.99 46.52 5 942.3 1108 1386 1657 
2 98.20 140.7 228.3 333.2 6 1383 1588 1922 2239 
3 333.4 428.2 602.2 788.2 7 1887 2129 2513 2870 
4 688.4 836.6 1093 1351 8 2454 2729 3158 3548 
5 1145 1344 1675 1996 9 3087 3391 3857 4274 
6 1693 1940 2338 2712 10 3794 4122 4619 5054 
7 2332 2622 3079 3498 11 4590 4939 5455 5899 
8 3067 3396 3903 4356 12 5510 5872 6397 6837 
9 3915 4277 4822 5299 13 6653 7017 7530 7942 
10 4914 5302 5872 6356 
11 6178 6579 7151 7616 14 1 3.580 7.176 18.07 36.57 
2 76.42 109.5 178.0 260.0 
12 1 4.176 8.372 21.08 42.65 3 257.1 330.6 465.8 611.0 
2 89.68 128.5 208.6 304.6 4 525.9 640.3 838.9 1041 
3 303.4 389.8 548.6 718.7 5 866.0 1019 1276 1527 
4 624.0 759.0 992.5 1229 6 1267 1457 1766 2061 
5 1034 1215 1517 1810 7 1724 1947 2304 2636 
6 1522 1746 2109 2453 8 2234 2488 2886 3250 





4 Frank E. Grubbs, “On designing single sampling inspection plans,” Annals of Mathematical
Statistics, 20 (1949), 242-56.
5 Maxine Merrington and Catherine M. Thompson: “Tables of percentage points of the inverted
beta (F) distribution,” Biometrika, 33 (1943), 73-88; and C. J. Burke: “Computation of the levels of
significance in the F test,” Psychological Bulletin, 48 (1951), 392-97.

TABLE 1—(cont.)
N X .005 .010 .025 .050 N X .005 .010 .025 .050
he 9 2799 3080 3514 3904 16 © 6370«/«6684=— 71317499 
10 3421 3726 «©4190-4600 17 -7322,—S« 7627S 80478384 
11 = 4108S 4433—« «4920-5343 
12 4877 «=«—«5217,——s«B719——s«C 4G 18 1 2.784 5.582 14.06 28.46 
13-5760, «6109S 6137083 2 58.99 84.57 137.5 201.1 
14 = 6849——s«s7107/—«s«7684=— (8074 3 197.0 253.6 357.9 470.2 
is 4 400.2 488.0 640.9 796.9 
150 i(‘<aLMSsi“‘é‘<« 68 | 16.86 | 84.14 5 = 654.4 771.9 969.5 1164 
2 71.17 102.0 165.8 242.3 6 950.7 1006 1334 1568 
he 3 238.9 307.2 433.1 568.5 7 1284 145417301989 
4 487.6 593.9 778.7 966.6 8 165018442153 2440 
rs 5 801.1 943.6 1182 1417 9 2046 «= 2263 2602 2912 
6 1170 ~=«1346Ss«1634~=—«1909 10 ©2474 «= 2710» 3076 = (3406 
7 1587 1795 2127) 2437 ae 2982 3186 3575 3922 
8 2051 2287 2659 3000 12 3421 3691 4099 4460 
N 9 2561 2823 3229 3596 13 3945 4228 4652 5022 
= 12 4395—«—is«AT1S—— BIDS (5:G02 16 = «5783 8088 65296897 
) 13 5137-5468 = «5954 = 6366 17 6537 6840 7271 7623 
iad 14 5984 6321 6805 7206 18 7450 7743 $147 8467 
2 15 7024 «= 7356 = 7820-8190 
. 19 «=1 = - 2.688 «5.288 =—«13.32 26.96 
: 16 «= sAs«S.132 6.280) 15.81 82.01 : pre ones oo : pe 
; : a oy ar 4 377.7 460.6 605.2 752.9 
, 4 454.5 553.8 726.6 902.5 ; a ——- a ae 
; = = = a 1801888 1020 1875 
6 1086 1251 1520 1 
, 7 1471 «= '1665 19752267 . . a. a a 
, 8 1807 2117 2465 2786 
: 9 2362 2607 «2988-3334 a a Ba me 
' sons Ss = = 123191 3447 38364181 
1k «3415 35701 41344517 3 4th: ti«CTO 
; sm a cs ms 14-4182 4462 «48805242 
so oe om os 15 «4729S 50184445809 
. 14 53725605 C165 «562 ‘ose 6 (lems 
‘ 15 «6186. «6512, «6977-7360 So - «- #- 
‘ 16 = 7181. «7499S 7941 8298 18 essiéatsi‘ié‘éOT?:OC*«*CTTORG 
19 7567 7848 8235 8541 
17 -1—s«2.948 5.910 14.88 30.13 
} 2 62.56 89.67 145.8 213.2 20 1 2.506 5.024 12.65 25.61 
; 3 209.2 260.2 «379.9 499.0 2 52.95 75.92 123.5 180.7 
) 4 425.6 518.8 681.1 846.4 3 176.4 «227.1 320.7 421.7 
5 697.0 821.7 1031 1238 4 357.6 436.2 573.3 713.5 
6 1014 1168 1421 1664 5 583.3 688.4 865.7 1041 
7 1371 1552 1844 2119 6 845.5 975.4 1189 1395 
; 8 1764 1971 2208 2601 7 139 1292 ©1539 «1778 
) 9 2193 2423 2781 3108 8 1460 1634 1912 2171 
* 10 «2656 «2006 «= 32083640 9 1806 2001 2306 2586 
11 = 3154-3423, 38334197 10 ©2177, S390») 27203020 
= 12 3690-3075 44044781 11 =—s-:2572 2801S 31523469 
13 4268 «= 4566 «= 50105395 12-2001 3234 = 36053956 
- 14 + 4806S s«5204 «= «5657 (044 13-3434. 3691 «= «4078 = 4420 
of 15 «5587 «= «B01. « «63566738 14 3004S 4171S 4572 4922 
TABLE 1—(cont.)





N X .005 .010 .025 .050 N X .005 .010 .025 .050








15 4402 4679 5090 5444 5 501.7 592.5 746.0 898.1 
16 4934 5217 5634 5990 6 725.3 837.5 1023 1202 
17 5505 5793 6211 6563 7 974.3 1107 1321 1525 
18 6129 6417 6830 7174 8 1246 1397 1638 1863 





19 6829 7112 7513 7839 9 1537 1705 1971 2216 
20 7673 7943 8316 8609 10 1848 2031 2319 2582 
11 2176 2374 2682 2961 
21 1 2.387 4.785 12.05 24.40 12 2521 2733 3059 3352 
2 50.37. 72.22 «117.5 = 171.9 13 2884 3108 3450 3754 
3 167.7 215.9 304.9 401.0 14 3264 3499 3854 4169 
4 339.5 414.2 644.6 678.1 15 3662 3906 4274 4596 
5 553.3 653.2 821.8 988.5 16 4079 4331 4708 5036 
6 801.2 924.7 1128 1324 17 4517 4776 5160 5490 
7 1078 1224 1459 1682 18 4978 5242 56306 5961 
8 1381 1546 1811 2057 19 5466 5733 6123 6451 
9 1707 1891 2182 2450 20 5988 6256 6641 6964 
10 2055 2257 2571 2858 21 6554 6819 7196 7507 
11 2425 2642 2978 3281 22 7186 7443 7805 8098
12 2815 3047 3402 3719 23 7942 8185 8518 8779
13 3228 3472 3844 4172 
14 3663 3819 4303 4641 24 1 2.088 4.187 10.54 21.35 
15 4122 4387 4783 5126 2 43.95 63.03 102.6 150.1 
16 4608 4880 5283 5630 3 145.9 187.9 265.6 349.5 
17 5124 5402 5809 6156 4 204.7 359.8 473.5 590.1 
18 5678 5959 6366 6708 5 479.3 566.2 713.2 858.8 
19 6282 6561 6962 7294 6 692.5 799.9 977.3 1149 
20 6957 7232 7618 7933 7 929.7 1056 1262 1457 
21 7770 8031 8389 8671 8 1188 1332 1563 1780 
9 1465 1626 1880 2116 
22 1 2.278 4.567 11.50 23.29 10 1759 1935 2211 2464 
2 48.03 68.88 112.0 164.0 11 2070 2260 2555 2824
3 159.7 205.7 290.6 382.2 12 2396 2599 2912 3194 
4 323.1 394.3 518.7 646.0 13 2738 2953 3282 3576 
5 526.2 621.4 782.1 941.1 14 3096 3322 3664 3968 
6 761.3 878.9 1073 1260 15 3470 3705 4059 4371 
7 1024 1162 1386 1599 16 3860 4104 4468 4736
8 1310 1468 1720 1956 17 4268 4519 4891 5213 
9 1618 1793 2071 2327 18 4696 4952 5329 5653 
10 1946 2138 2439 2713 19 5145 5405 5785 6109 
11 2293 2501 2822 3113 20 5621 5882 6262 6582
12 2660 2881 3221 3526 21 6127 6538 6764 7078 
13 3046 3280 3636 3952 22 6676 6934 7300 7602 
14 3451 3696 4066 4391 23 7287 7539 7887 8171 
15 3877 4132 4513 4846 24 8019 8254 8575 8827 
16 4326 4588 4978 5315 
17 4799 5068 5463 5802 25 1 2.005 4.019 10.12 20.50 
18 5301 5574 5972 6309 2 42.16 60.46 98.39 144.0 
19 5839 6113 6509 6841 3 139.9 180.2 254.7 335.2 
20 6423 6695 7084 7405 4 282.3 344.7 453.8 565.6 
21 7076 7342 7716 8019 5 458.9 6542.2 683.1 822.9 
22 7860 8111 8456 8727 6 662.6 765.5 935.6 1101 
7 888.9 1010 1207 1395 
23 1 2.179 4.369 11.00 22.28 8 1135 1273 1495 1703 
2 45.90 65.82 107.1 156.7 9 1399 1553 1797 2024 
3 152.5 196.4 277.5 365.2 10 1679 1848 2113 2356 
4 308.3 376.3 495.1 616.8 11 1974 2156 2440 2699 














N X .005 .010 .025 .050 N X .005 .010 .025 .050
12 «2284S «2479 = 2780 = 3051 15 3002S «3213.«S 35333816 
13 © 2607,-—='s«2814 = 3131344 16 =. 3331.«S 3550 = 38804171 
14 «2046S 316334933786 17 +: 3672-S«« 3808 «= «4237 «4534 
15 3208 «= «3525 «= 38674168 18 4025 4257 «= «4604 «4905 
16 3665 «= «390042524561 19 4392 4630 4982-5286 
17 4048 «= 4289 4651 (4964 20 4774 «8016. «S«5372—S«5677 
18 4447 4604 «50625378 21 ««5173.—ss«BAI7-—Ss«5774=— 6079 
19 4864 5116 5487 5805 22 «= «B589s«83S C1924 
20 5302 5557 5930 6246 23 «© 6027,-«Ss«273«Ss«627 Ss 6924 
21 5765 6021 6392 6704 24 «©6493 «6736 «= 7084 «7378 
22 «257s 512——s«87B=OC7183 25 «= 6906S s«7234=S ss 7571=— 7847 
23 «©=-6700-«s—«s7041.=Ss«7397 = 7600 26 ©7554 «= «7783)=S«8103 «(8360 
24 «= 7382—«—s«=7625—=—i«7965;~—=C(‘é BD 27 = 8218. «8432S 8723 «8950 
25 8090 «8318 «= 8628 «8871 

2 «= 1s«.790 = 8.589 = 9.088 ~— «18.30 

26 1 1.928 3.865 9.733 19.71 2 37.57 53.88 87.70 128.4
2 40.51 58.10 94.55 138.4 3 124.4 160.8 226.7 298.5 
3 134.3 173.0 244.6 322.0 4 250.7 306.2 403.4 503.1 
4 271.0 330.8 435.6 543.1 5 406.8 480.9 606.4 731.1
5 440.1 520.1 655.5 789.8 6 586.5 678.0 829.6 976.9 
6 635.1 733.9 897.4 1056 7 785.6 893.5 1069 1237 
7 851.6 960.8 1157 1338 8 1002 i125 1322 1509 
8 1087 1220 1433 1633 9 1232 1370 1588 1791 
9 1338 1487 1721 1940 10 1477 «1627 =S «1864 = 2082 
10 = 16051768 «= 2023S 2257 11 «1733 «1806. «21502383 
11 =—«-'1887, 206223352584 12 2002,-'ss«2176=— 2446 «2691 
12 «2181-2369 2659S 202 1322822466 «= 2751 «=: 3007 
13-2489 2688) «20933266 14 —«-:2572,-s«-2767 «= 30658331 
14 = 2809-8018 «S 8337S (3621 15 2874 «= 3078 «= 8387 = (3662 
15 «3143S: 3361=S 3602S 3084 16 ©3186. «= 3398 «= 3718 += 4000 
16 3490S 3716 = 40574357 17-3510 3729S 4058 = 4346 
17-3850 4084 = 44334739 18 ©3845 «= «4070 «=: 4407 = 4700 
18 42254465) 48215130 19 4192 4422« «4765 = 5062 
19 4615 4860 5222 5532 20 4552 «4786 = 5134 = 5433 
20 6023 5271 5636 5946 21 «4925 «= «5163S s«5513S« 813 
21 «5450S 5700» 6065374 22 «53135554 «5905 = 6208 
22 «© 5900s«G1S1 = «65136818 23 «= -B720.-s«B961=Ss 11S 6606 
23 «= «63796628 = 60857281 24 «= «6147,—««s«6387 = «6734 = 7028 
24 «= 680671417487) 7771 25 «= 6601.«««6838.—=s7177~=S 7459 
25 7471 7707 8036 8301 26 ©7089 «©7321«S 7650S 7918 
26 «© 8156 «Ss 8377 «s«8677-—=S «8912 27° ««-7631.=S«s«7854= «81658415 

28 «= 8276 «=: 8483 «8766 = 8985 

27 «1s«i1,856 3.722 «9.373 «18.98 
2 38.99 55.91 91.00 133.2 29 #1 1.728 3.465 8.726 17.67 
3 129.2 166.4 += 235.3 309.8 2 36.25 52.00 84.64 123.9 
4 260.4 318.0 418.9 522.3 3 120.0 «154.6 218.6 288.0 
5 422.8 499.7 630.0 759.3 4 241.7 295.2 389.0 485.2 
6 609.8 704.9 862.2 1015 5 302.0 463.4 584.6 704.9 

7 817.3 920.3 1111 1285 6 564.9 653.2 799.4 941.6 
8 1042 1170 1375 1568 7 756.4 860.4 1030 1192 
9 1283 1426 1652 1862 8 963.9 1083 1273 1453 
10 1588S «160519402168 9 1185 1318 1528 1725 
11 ©1807 Ss «:1976 += 2239S 2479 10 = 1420S 1565 «1794 = 2005 
12 2088S 2268 «= 2548 (801 11 1666 ~=— «1823 «2069 «2298 
13-2381 S572 «28673131 12 1923-2001 «23522589 
14 2686. « 2887 «= 3195-3470 13-2191 «2869 «2645 2893 
TABLE 1—(cont.)








N X .005 .010 .025 .050 N X .005 .010 .025 .050
14 2469 2657 2945 3203 9 1102 1225 1422 1606 
15 2757 2953 3253 3520 10 1318 1454 1668 1866 
16 3055 3259 3569 3844 11 1546 1693 1923 2134 
17 3362 3574 3894 4175 12 1783 1940 2185 2408
18 3680 3899 4226 4512 13 2029 2197 2455 2688 
19 4009 4233 4567 4857 14 2285 2461 2732 2975 
20 4349 4578 4917 5210 15 2549 2733 3015 3267 
21 4701 4934 5276 5571 16 2821 3014 3306 3566 
22 5067 5302 5646 5941 17 3102 3302 3603 3870 
23 5447 5683 6028 6320 18 3392 3598 3908 4180 
24 5843 6080 6423 6711 19 3691 3902 4219 4496 
25 6260 6495 6834 7116 20 3998 4214 4537 4818 
26 6702 6934 7265 7539 21 4315 4536 4863 5146 
27 7177 7404 7723 7985 22 4642 4866 5196 5481 
28 7704 7921 8224 8466 23 4980 5206 5539 5823 
29 8330 8532 8806 9019 24 5329 5557 5891 6174 

25 5692 5920 6253 6534 

30 1 1.671 3.349 8.436 17.08 26 6070 6298 6627 6904 

2 35.03 50.24 81.78 119.8 27 6467 6693 7017 7286 
3 115.9 149.3 211.2 278.2 28 6887 7109 7425 7685 
4 233.3 285.0 375.5 468.5 29 7338 7554 7858 8105 
5 378.2 447.2 564.2 680.5 30 7837 8043 8329 8559 
6 544.8 630.1 771.4 908.8 31 8429 8620 8878 9079 
7 729.2 829.7 993.4 1150 
8 928.9 1044 1228 1402 32 1 1.566 3.140 7.909 16.02 
9 1142 1270 1473 1663 2 32.81 47.06 76.61 112.2 
10 1367 1508 1729 1933 3 108.4 139.7 197.7 260.4 
11 1604 1755 1993 2211 4 218.2 «266.5 351.3 438.5 
12 1850 2013 2266 2495 5 353.4 418.0 527.5 636.5 
13 2107 2280 2546 2787 6 508.7 588.4 720.8 849.6 
14 2373 2555 2834 3085 7 680.4 774.4 927.7 1074 
15 2648 2839 3129 3389 8 866.1 973.4 1146 1309 
16 2933 3131 3433 3699 9 1064 1184 1375 1553 
17 3227 3432 3743 4016 10 1273 1404 1612 1804 
18 3530 3742 4060 4339 11 1492 1634 1857 2062 
19 3843 4060 4386 4669 12 1720 1873 2110 2326 
20 4166 4388 4719 5005 13 1957 2119 2370 2597 
21 4499 4726 5061 5349 14 2203 2374 2636 2873 
22 4844 5074 5411 5701 15 2456 2636 2909 3154 
23 5201 5433 5772 6061 16 2718 2905 3189 3441 
24 5573 5805 6144 6430 17 2988 3181 3474 3733 
25 5960 6192 6528 6810 18 3265 3465 3766 4031 
26 6366 6597 6928 7204 19 3551 3756 4064 4335 
27 6797 7024 7347 7614 20 3844 4055 4369 4644 
28 7260 7481 7793 8047 21 4146 4361 4681 4958 
29 7773 7985 8278 8514 22 4457 4676 4999 5279 
30 8381 8577 8843 9050 23 4778 4999 5325 5606 
24 5109 5332 5660 5940 
31 1 1.617 3.242 8.164 16.53 25 5451 5675 6003 6281 
2 33.88 48.60 79.11 115.8 26 5805 6030 6356 6631 
3 112.1 144.4 204.2 269.0 27 6175 6399 6721 6992 
4 225.5 275.4 363.0 453.0 28 6562 6784 7101 7364 
5 365.4 432.1 545.2 657.8 29 6972 7189 7498 7752 
6 526.1 608.5 745.2 878.2 30 7412 7623 7919 8161 
7 704.0 801.1 959.4 1111 31 7898 8099 8378 8602 
8 896.4 1007 1186 1354 32 8474 8660 8911 9106 
TABLE 1—(cont.)











N X .005 .010 .025 .050 N X .005 .010 .025 .050
6 33 1 1.519 3.045 7.660 15.53 23 4422 © 4634. 's«4048=—Ss«*5 218 
6 2 31.80 45.61 74.26 108.8 24 4722 4936 «= «5 253s« 524 
4 3 105.1 135.4 191.5 252.4 5 5030 «45246 «= s«5564 «5886 
8 4 211.3 258.2 340.3 424.8 26 5348 5565 5883 6154 
8 5 342.2 404.7 510.9 616.6 27 5676 ©5804. s«6210—Ss«6479 
6 492.3 569.6 697.9 822.8 28 «© «6015.—(—«28BCti«éAT:—Ss«CSBIZ 
7 7 658.3 749.4 898.0 1040 29 6360 «6584. s«804—Ss«7154 
6 8 837.8 941.7 1109 1268 30 6738 6951 7255 7507 
0 9 1029 1145 = «1330S s«1508 31 7129 7337 7632 7875 
0 10 1231 1358 1559 1746 32 7548 7749 8032 8262 
6 il 1442-1580) s«:1798—Ss«1995 33 g009 8202 8467 8679 
3 12 1662 1810 2040 2250 34 8557 8733 «= 8972—Ss«)57 
J 13 1800 2047 2201 2611 
4 2127 ©2203 s«24B S278 35 1 1.482 2.871 7.231 14.64 
15 9371 2545 2811 3049 2 20.96 42.97 69.97 102.5 
16 2623 2804 3080 3326 3 98.91 127.6 180.4 237.7 
17 2881 3070 3355 3608 4 198.8 242.9 320.3 399.9 
| 18 3147 3342S 3636=S 3894 56 321.7 380.6 480.6 580.2 
| 19 3419 3621 3922 4185 6 462.7 535.4 656.2 773.9 
! 20 3702 3907 4214 4482 7 618.3 704.0 844.1 978.3 
21 3900 4200 4518 4784 8 786.4 884.3 1042 1191 
22 4287 4501 «© 4818 «= 5092 9 965.3 1075 1249 1412 
23 4503 4809 «= 5129» 5405 10 1154 «:1274S's«1464~—Ss«1640 
24 4907 «5126S s«5448=—Ss«5 724 ll 1351 1481 1685 1878 
25 6231 5452 5774 6050 12 1556 1696 «=-«:1913- «2112 
26 8566 60-5787 «= 6109S s«6383 13 1769 1918 2147 2356 
27 5913 6134 «Ss «6454 06724 M4 1989 2146 2387 2605 
28 6274 6494 6810 7075 15 2216 2380 2632 «=. 2859 
29 6653 6870 7180 7438 16 2450 2621 2882 93117 
30 7053 «7265 «= 7567 «= 7815 17 2690 2868 3138 3379 
31 7482 7688 #7077 8213 18 2936 3121 «= 3399 -S«—«8646 
32 7955 815284248642 19 3189 3379 3665 3917 
33 8517 8607 8942 9132 20 3448 3643 3935 4192 
21 3714 -3013.'s«42NSi(‘tié ATID 
4 11.474) 2.986 7.444 = (15.07 22 3086 4189 «= «4492—s«4756 
2 30.85 44.25 72.05 105.5 23 4265 4472 4779 5045 
8 101.9 131.3 185.8 244.8 24 4551 4761 65072 5339 
4 204.8 250.3 330.0 412.0 25 4846 5058 «= «5370S s«B 638 
56 331.6 302.3 495.3 597.8 26 5148 © 5361. «Ss«5675 = s«B 942 
6 477.0 552.0 676.4 797.6 a7 5450 5674 5987 6253 
7 687.7 726.0 «870.2 += 1008 28 5780 5905 6306 6570 
8 811.3 912.1 1075 1228 29 6113 6326 6635 6894 
9 996.1 1109 1288 1456 30 6458 6670 6974 7228 
10 1191 1315 1510 1691 31 6820 7020 7326 7573 
11 1305 1529 1739 19382 32 7202 7406 7694 7931 
12 1607 1751 «1975s 2.179 33 7611 7808 8084 8309 
13 1828 1980 2217 2431 34 8062 8250 8509 8715 
4 2056 «46022172465 2688 35 8595 8767 9000 819 
15 2291 2460 2719 2951 
16 2538 2700 2078 #3718 36 1 1.302 «2.701 «7.080 = 14.24 
17 9782 2965 3243 3490 2 20.11 41.76 68.00 99.61 
18 3038 3227 3513 3766 3 96.10 123.9 175.3 231.0 
19 3300 3495 3789 4047 4 193.1 235.9 311.2 388.5 
20 3570 3770 4070 4332 5 312.4 369.6 466.8 563.6 
21 3847 4051 4357 4623 6 449.1 519.8 637.2 751.6 
22 41381 4330 4649 «©4918 7 600.1 683.3 819.4 949.9 



















AMERICAN STATISTICAL ASSOCIATION JOURNAL, DECEMBER 1953 
TABLE 1—(cont.) 








N X .005 .010 .025 .050 N X .005 .010 .025 .050
8 763.0 858.1 1012 1157 27 5077 5284 5588 5859 
9 963.3 1043 1212 1371 28 5368 5576 5880 6140 

10 1119 1236 1420 1591 29 5667 5876 6179 6436 
11 1310 1436 1635 1818 30 5975 6183 6485 6739 
12 1509 1644 1856 2049 31 6294 6501 6799 7048 
13 1714 1859 2082 2285 32 6626 6830 7123 7367 
14 1927 2079 2314 2526 33 6972 7172 7548 7695 
15 2146 2306 2551 2772 34 7337 7532 7809 8036 
16 2371 2539 2793 3022 35 7727 7916 8181 8395 
17 2603 2777 3040 3275 36 8158 8336 8584 8781 
18 2841 3021 3292 3533 37 8666 8830 9151 9222 
19 3085 3270 3549 3795 
20 3334 3524 3810 4061 38 1 1.319 2.644 6.660 13.49 
21 3590 3784 4076 4331 2 27.56 39.54 64.39 94.32 
22 3851 4050 4347 4605 3 90.93 117.2 165.9 218.6 
23 4119 4322 4622 4883 4 182.6 223.1 294.3 367.6 
24 4393 4599 4903 5166 5 295.3 349.4 441.4 533.1 
25 4675 4883 5190 5454 6 424.3 491.2 602.3 710.7 
26 4964 5174 5482 5746 7 566.6 645.4 774.3 9897.9 
27 5260 5471 5780 6043 8 720.1 810.1 955.4 1093 
28 5566 5777 6085 6347 9 883.3 983.8 1144 1295 
29 5880 6091 6398 6657 10 1055 1166 1340 1503 
30 6206 6416 6719 6973 11 1235 1354 1542 1716 
31 6544 6752 7050 7299 12 1421 1550 1750 1934 
32 6898 7102 7394 7636 13 1615 1751 1963 2156 
33 7271 7471 7753 7985 14 1814 1958 2181 2383 
34 7671 7863 8134 8353 15 2019 2171 2404 2614 
35 Sill 8294 8547 8749 16 2230 2389 2631 3848 
36 8631 8799 9026 9202 17 2447 2612 2862 3087 
18 2669 2840 3098 3329 

37 1 1.355 2.716 6.840 13.85 19 2896 3072 3338 3574 
2 28.32 40.62 66.15 96.89 20 3128 3309 3582 3822 
3 93.44 120.4 170.4 224.6 21 3365 3551 3830 4075 
4 187.7 229.4 302.5 377.8 22 3608 3798 4083 4331 
5 303.6 359.2 453.7 547.9 23 3857 4050 4339 4591 
6 436.3 505.1 619.3 730.6 24 4110 4307 4599 4854 
7 582.9 663.8 796.2 923.2 25 4370 4569 4865 5121 
8 740.9 833.4 982.7 1124 26 4635 4837 5135 5392 
9 909.1 1012 1177 1332 27 4906 5110 5410 5667 

10 1086 1199 1379 1546 28 5185 5390 5690 5947 
11 1271 1394 1587 1765 29 5470 5676 5976 6231 
12 1464 1595 1801 1990 30 5764 5970 6269 6521 
13 1663 1803 2021 2219 31 6067 6271 6568 6817 
14 1869 2017 2246 2453 32 6379 6582 6875 7120 
15 2081 2236 2475 2691 33 6703 6904 7192 7431 
16 2299 2461 2710 2933 34 7042 7239 7520 7751 
17 2522 2691 2949 3178 35 7399 7591 7862 8084 
18 2752 2927 3192 3428 36 7781 7966 8225 8435 
19 2987 3168 2440 3681 37 8202 8377 8619 8811 
20 3227 3413 3692 3938 38 8699 8859 9075 9242 
21 3474 3664 3949 4199 

22 3726 3920 4210 4464 39 1 1.285 2.577 6.490 13.14 
23 3983 4181 4476 4732 2 26.85 38.51 62.72 91.88 
24 4247 4448 4746 2005 3 88.54 114.1 161.5 212.9 
25 4517 4721 5022 5282 4 177.8 217.3 286.6 358.0 
26 4793 4999 5302 5563 5 287.4 340.1 429.7 519.0
TABLE 1—(cont.) 

.050 N X .005 .010 .025 .050 N X .005 .010 .025 .050
89 6 412.9 478.0 586.2 601.9 22 = 3306.«S «3577 = 8850 = 4088 
8140 7 551.3 628.0 753.5 874.0 23 «= 3627S «3812 4090-4332 
5436 § 700.5 788.1 929.6 1064 24 = 3863«S «4052 4333°= «4578 
5739 9 859.0 956.9 1113 1260 25 «= 4104« «4206 «= 4581 = 4828 
7048 10 ©—:1026.—S «1133. :1304 = 1462 26 «= 43504344 « 4832 «(5081 
7367 11 =: 1200-Ss«1317,— «15001669 27 = 4600-4797 «= 5087-5337 
7695 12 :1381~S «1506 = 17021881 28 © 4857S « 50555347597 
3036 13 = 1569) «1702,- 1909-2097 29 «5119S «5319 6115861 
395 141762, 1908. 2121-2317 30 © 5388«= «5588 58806129 
781 15 =: 1961S 2109-23375 31 5663 «= 5863S «1556402 
999 16 2165 2320-2557 2769 32-5046 6146 = 64356680 

17 = 2375S 2536 = 2781 = 3000 33 «6237-6436 = 67226063 
49 18 =. 2590-2757 = 3010 3.235 34 © 6537_-—« «6734 = 7017-7253 
39 19 = -2810 2982 32423473 35 = 6849—««7043-« 73207550 
8.6 20 3034 3211 3478 ©3714 36 = 7174.«S «7364 «= 7634 «= 7856 
76 21 8264. 3446 = 37183958 37 «7516S 7701S 79618174 
31 22 3499-3685 3963 «(4206 38 —-7882-—««s«8060)= 8308 = 8509 
0.7 23 «= 3738 = 3928S 42M 4457 39 = 8285-8453 86848868 
7.9 24 «= 3982s «4175—(iiG ATI 40 8759 8913 9119 9278 
093 2 4232 «= 4428« 4718 = (4970 
295 26 46 4487-4686 49795282 41 11,223 2.451 6.1738 12.50 
503 27 47484049 52435497 2 25.52 36.62 59.63 87.36 
716 28 48= «8015-27135 766 3 84.13 108.4 153.5 202.4 
934 29 «= 5289S 5491 «5787 = 6040 4 168.8 206.4 272.3 340.2 
156 30 © 5569S «57726087 ~=—«63118 5 272.8 322.9 + 408.1 493.0 
383 31 5857 «60606354 «6602 6 301.8 453.7 556.6 657.0 
814 32154 635560476892 7 522.9 595.8 715.2 829.8 
848 336460 6660 69477188 8 664.2 747.5 882.1 1010 
087 34 © «6778.«S «697572577492 9 814.3 907.3 1056 1196 
299 35 711073037578 =—(7805 10 972.1 «1074S :1236—1387 
574 3607459 7647)= 7913S 8130 11 s37,—s«:1248-s1422 1583 
309 37 -7833.= «8014S 82688473 12-1308. —S «42716131784 
075 38 = 8245 8416 = 8652884 13 «148516111808 1988 
331 39 © 8730S «8886 «(9098 = (9261 14 1667S 180120082196 
91 15 1855 199522122408 
354 40 «11,253 2.512 6.327 12.81 16 © - 2047S «2194 = 24202623 
21 2 26.17 «37.54 61.14 89.57 17 2244-2397 26322841 
92 3 86.28 «11.2 157.4 207.5 18 ©2447-2004 28473062 
67 4 173.2 211.7 279.3 348.8 19 26522816 = 3066 = 8287 
47 5 279.9 331.3 418.6 505.7 20 2863 «= 3032, 3288 = 3514 
31 6 402.1 465.5 571.0 674.0 21 © 3079S «3252s B14 8744 
1 7 536.7 611.5 733.8 = 851.3 223208 «= 3476 = 87433977 
17 8 681.9 767.3 905.2 1036 23 «3522, 3703S «3976 = 4214 
20 9 836.1 931.4 1084 1227 24 «3750 3935) 42124453 
31 10 998.3 1108. :12691424 25 «= 3983s 41744514604 
51 11 = 11681281, 14601625 26 «= 42204411 4694 = 4939 
34 12 1344S :1466 16561831 27 4462 4655 49415187 
35 13 1526 165518572041 28 © 47094904 51925438 
1 14-1713, «185020632255 29 © 4962,s«5158— 447) 5603 
‘9 15 = 1906. 2051-2273 2473 30 5219 5417 5706 + 5952 

16-2104 S256) 2486 2604 31-8483. 568159706215 
14 17 2307-2465 2704 «2918 32 «5754 S951 = 62396482 
38 18 2516 2679 ©2927 «3146 33 © 6031.—« «6228 «= «6513754 
9 19 ©2728 «2897, 31513377 34 ««6317—«s«G512_—s« 794 = 7082 


0 20 2946 3119 3380 3611 35 6611 6805 7083 7315 
0 21 3169 3346 3613 3848 36 6917 7108 7380 7605
N X .005 .010 .025 .050 N X .005 .010 .025 .050
37 «-7236«S«7492)=—s«7687~=—s(7905 8 631.5 710.8 839.1 960.9 
38 7571 7752 8008 8216 9 773.9 862.5 1004 1138 
39 7929 «= 8104 «Ss 83478548 10 923.6 1021 1176 1320 
40 8323 8488 8715 8894 il 1080 1185 = «13521506 
41 8788 8938 9140 9205 12 1242 1385 s«1533 (1696 

13 1409 ©=—«:1580s«1718 1890 

42 1 1.193 = 2.393) -6.026 12.21 14 1582 1710» 19082088 

2 24.01 35.74 58.20 85.27 15 1759 ©1804 = 2101S (2288 
3 82.09 105.8 149.8 197.5 16 1941 2081 «= 22972492 
4 164.7 201.3 265.6 331.9 17 —«-2127,—Ss«2278=— 2497 = 2698 
5 266.1 315.0 398.1 481.0 18 2318 2469 2701 2907 
6 382.0 442.4 542.8 640.9 19 2513 2669 2908 3120 
7 509.8 581.0 697.4 809.3 20 2711.-=Ss« 2878 = 8118 = 8335 
8 647.4 728.7 860.1 984.7 21 2012 3080 3331 «3553 
9 793.6 884.3 1030 1166 22 «3120-3291. = 8547 = 873 
10 ©947.2«S«:1047—Ss«1205— «1388 23 3331 3505 3766 3996 
ul 1107 ««1216-—(s«1386 S544 24 «= 3545 3723 30884221 
12 1274 1390 1572 1739 25 «= 3763S (si8044 A244 
13 1446 ©=-1570s«1762 «(1938 26 46 «3084S «4168 = 44424679 
14 1624 1754 1957 2141 27 «= 4210S ««4396= 46734912 
15 1806 194321552347 28 40s «4441—i(i‘2D «4008148 
16 1991 2136 2357 2556 29 «©: 4676«S ss: 48660=— «51465387 
17 —«-2183 «2334 = 2563-2768 30 4915 5106 5388 5629 
18 2381 2535 2773 2983 31 5159 5351 5633 5874 
19 2581 2740 2984 3201 32 5408 5601 5883 6123 
20 2785 + «=. 2950s «32003422 33 5663 5856 6137 6375 
21 «2004S «3164 = 3420S 3646 34 «5924 = «116 «= 6306 =—6632 
22 3207 3381 3642 3872 35 6192 6383 6660 4893 
23 © 3424 «Ss 3802-3868 = 4102 36 6467S «665769307150 
24 «3645S 8826 «= 4097 «4333 37. «6751s 698872077431 
25 3870 4054 4329 4568 38 © 7045 «= 7229=S 7492 = 7709 
26 4090 4286 4564 4806 39 7351 7531 7787 7996 
27 ««s«4333,=«i«<“‘«wS DiS 40 7673 7848 8004 8204 
28 4571 4762 5045 5289 41 8018 8185 8419 8607 
29 © 4814. «S«5007-«5202—Ss«5536 42 8396 8554 8771 8944 
30 5062 5257 5542 5786 43 8841 8984 9178 9327 

31 5316 5511 5796 6039 

32 «5575 = «5770S 0556297 44 1 1.139 -2.284 «5.752 11.65 
33 «5841 «6086 = «6319 «6559 2 3.77 34.09 55.53 81.36 
34 6113 6307 6588 6825 3 78.28 100.9 142.9 188.4 
35 6394 6586 6864 7007 4 157.0 191.9 253.3 316.5 
36 6683 6873 7146 7374 5 253.6 300.2 379.4 458.6 
37 «««6982.—«é<“‘z74GO«=«C«C7437~=s«(76858 6 363.9 421.5 6517.3 610.9 
38 7205 7478 7738 7952 7 485.5 553.3 664.4 771.3 
39 7623 7801 8052 8256 8 616.4 693.8 819.2 938.2 
40 7974 8145 8384 8576 9 755.2 841.8 980.4 1111 
41 8360 8521 8743 8919 10 901.2 996.8 = 11471288 
42 «8815s 8962~—S«s«éSD «9312 1 1053 1157'S s«1319=—:1470 
12 1211 1322 1496 1655 
43 1 1.166 © 2.887 «5.886 11.92 13 1375 ©=—-:1492S («1676 = s«1845 
2 24.32 34.90 568.3 83.27 14 1543 1667 +«—-1861 2037 
8 80.14 103.3 146.3 192.8 15 1715 1847-2049 S233 
4 160.7 196.5 259.3 324.0 16 1802 «=: 2080 '=««224kS 2431 
5 259.7 307.4 388.5 469.5 17 «-:2078.-——s«2216 = -2436 = 2632 
6 372.8 431.7 529.8 625.5 18 2259 2406 2634 2836 
7 497.3 566.8 680.5 789.9 19 © - 2448 «Ss 2601 «2835 = (3048
.050 N X .005 .010 .025 .050 N X .005 .010 .025 .050
960.9 20 2641 2799 © 3039 3252 31 4873 5060 5335 5571 
1138 21 2838 3000 3247 3464 32 5105 5293 5569 5804 
1320 22 3038 3205 3457 3678 33 5342 5530 5806 6040 
1506 23 3242 3414 3670 3895 34 5583 5771 6046 6280 
1696 24 3450 3625 3886 4114 35 5829 6018 6291 6523 
1890 25 3662 3839 4104 4335 36 6081 6269 6540 6770 
2088 26 3877 4057 4325 4559 37 6340 6526 40-6794 7021 
2288 27 4095 4279 4550 4786 38 6606 6790 7055 7276 
2492 28 4318 4504 4778 5015 39 6879 7061 6321 7537 
2698 29 4545 4732 5008 5247 40 7162 7341 7595 7805 
2907 30 4776 4965 5242 5481 41 7457 7631 7878 8080 
3120 31 5012 5201 5480 5718 42 7768 7936 8173 8366 
3335 32 5252 5442 5721 5959 43 8099 8260 8485 8666 
3553 33 5497 5688 5966 ©—-_-6 208 44 8462 8614 8823 8989 
3773 34 5748 5939 6216 6451 45 8889 9027 9213 9356 
3996 35 6004 6194 6470 6702 
4221 36 6267 6456 6729 6958 46 1 1.090 2.185 5.502 11.14 
4449 37 6538 6725 6994 7219 2 22.72 32.60 53.10 77.80 
4679 38 6816 7001 7265 7485 3 74.81 96.45 136.6 180.1 
4912 39 7105 7286 7545 7758 4 150.0 183.4 242.0 302.5 
5148 40 7405 7582 7833 8039 5 242.2 286.7 362.5 438.2 
5387 41 7722 7893 8135 8331 6 347.6 402.5 494.1 583.6 
5629 42 8059 8223 8453 8638 7 463.4 528.2 634.5 736.7 
5874 43 8430 8584 8798 9867 8 588.1 662.2 782.0 895.9 
6123 44 8866 9006 9196 9342 9 720.4 803.2 935.7 1061 
6375 10 859.4 950.3 1095 1230 
6632 45 1 1.114 2.233 5.625 11.39 11 1004 1103 1259 1403 
5893 2 23.23 33.30 54.29 79.54 12 1154 1260 1427 1580 
7159 3 76.51 98.63 139.6 184.2 13 1310 1423 1599 1760 
7431 4 153.4 187.5 247.5 309.3 14 1469 1589 1774 1943 
7709 5 247.7 293.38 370.8 448.2 15 1633 1759 1954 2129 
7996 6 355.5 411.8 505.4 596.9 16 1801 1933 2136 2318 
8294 7 474.2 540.5 649.1 753.6 17 1973 2110 2321 2509 
8607 8 601.9 677.6 800.2 916.6 18 2149 2291 2509 2703 
8944 9 737.4 822.0 957.5 1085 19 2328 2475 2700 ©2900 
9327 10 879.8 972.7 1121 1258 20 2511 2663 2894 3098 
ll 1028 1129 1288 1436 21 2697 2854 3090 3299 
11.65 12 1182 1291 1460 1617 22 2887 3048 #3289 3502 
81.36 13 1341 1457 1637 1801 23 3080 3245 3491 3708 
188.4 14 1505 1627 1817 1989 24 3276 3444 3695 3916 
316.5 15 1673 1802 2000 2180 25 3475 3646 3902 4126 
458.6 16 1846 1980 2186 © ©—-2373 26 3678 3852 4111 4338 
610.9 17 2022 2162 2375 2569 27 3884 4061 4323 4552 
771.3 18 2202 2348 2568 2768 28 4093 4272 4538 4768 
938.2 19 2386 2537 2765 2969 29 4306 4487 4756 4987 
1111 20 2574 2729 2964 3173 30 4523 4705 4976 5208 
1288 21 2766 2925 3166 3380 31 4743 4927 5198 5432 
1470 22 2961 3124 3371 3588 32 4967-6152 5424 5658 
1655 23 3159 3326 3578 3799 33 5195 5382 5654 5887 
1845 24 3361 3532 3788 4012 34 5428 5615 5887 6119 
2037 25 3566 3740 4001 4228 35 5666 5852 6123 6354 
2233 26 3775 3952 4216 4446 36 5908 6094 6363 6592 
2431 27 3987 4167 4434 4666 37 6156 6341 6608 6834 
2632 28 4203 4385 4655 4888 38 6410 6594 6858 7080 
2836 29 4423 4607 4878 5113 39 6671 6853 7113 7331 


3043 30 4646 4832 5105 5341 40 6940 7119 7374 7587
TABLE 1—(cont.)








.005 .010 .025 .005 .010 .050
7218 7393 7643 71.64 92.36 172.5 
7507 7679 7921 143.6 175.5 289.7 
7812 7978 8210 231.7 274.4 419.5 
8137 8295 8516 332.4 385.1 585.6 
8493 8642 8847 443.2 505.3 705.0 
8912 9047 9229 562.3 633.3 857.3 
688.7 767.9 1015 
1.066 2.138 5.385 821.3 908.4 1176 
22.23 31.90 51.96 959.5 1054 1342 
73.19 94.36 133.6 1103 1204 1510 
146.7 179.4 236.8 1251 1359 1683 
236.8 280.4 354.6 1403 1517 1857 
339.8 393.6 483.2 1559 1679 2035 
453.1 516.5 620.5 1719 1845 2215 
574.9 647.4 764.7 1883 2014 2398 
704.2 785.2 914.9 2050 2186 2583 
839.9 928.9 1070 2221 2361 2770 
981.3 1078 1230 2394 2539 2959 
1128 1232 1395 2571 2721 3150 
1280 1390 1562 2751 2904 3344 
1435 1552 1734 2934 3091 3540 
1595 1718 1909 3119 3280 3737 
1759 1888 2087 3307 3472 3936 
1926 2061 2268 3499 3667 4137 
2097 2237 2451 3694 3864 4340 
2272 2417 2637 3891 4065 4546 
2451 2600 2826 4092 4267 4753 
2633 2786 3018 4296 4473 4962 
2818 2974 3212 4503 4682 5174 
3005 3166 3408 4714 4894 5387 
3195 3360 3607 4928 5109 5603 
3389 3557 3808 5145 5327 5821 
3587 3757 4012 5366 5549 6043 
3787 3960 4218 5592 5774 6267 
3990 4166 4427 5822 6004 6493 
4196 4375 4639 6057 6238 6723 
4406 4586 4853 6296 6477 6956 
4620 4801 5069 6542 6721 7193 
4837 5020 5288 6794 6971 7435 
5057 5242 5510 7054 7228 7681 
283 5467 5736 7322 7493 7934 
5512 5696 5965 7602 7768 8194 
5745 5929 6197 7896 8056 8364 
5984 6167 6434 8209 8362 8746 
6228 6410 6674 8552 8695 9049 
6477 6659 6919 8955 9085 9395 
6734 6913 7170 
6998 7174 7426 1 1.023 2.051 10.46 
7271 7444 7690 2 21.32 30.58 73.01 
7556 7724 7962 3 70.15 90.45 168.9 
7855 8018 8246 4 140.6 171.9 283.6 
8173 8329 8546 5 226.9 268.6 410.8 
8523 8669 8871 6 325.4 377.0 546.9 
8934 9067 9245 7 433.7 4094.5 690.2 
8 550.3 619.7 839.2 
1.044 2.094 5.273 9 673.8 751.4 993.1 
21.77 =31.23 = 50.87 10 803.5 888.8 1151 














1953 





INCOMPLETE BETA FUNCTIONS 


TABLE 1—(cont.) 








X .005 .010 .025 .050 N X .005 .010 .025 .050
11 938.6 1031 1177 1313 6 318.6 369.2 453.4 535.7 
12 1079 1178 1334 1478 7 424.7 484.3 581.9 676.0 
13 1223 1329 1495 1646 8 538.8 606.8 717.0 821.8 
14 1372 1484 1658 1817 9 659.6 735.7 857.6 972.5 
15 1524 1643 1825 1991 10 786.5 870.0 1003 1127 
16 1680 1804 1995 2168 11 918.6 1009 1153 1286 
17 1840 1969 2167 2346 12 1055 1153 1306 1447 
18 2003 2137 2342 2526 13 1197 1301 1463 1612 
19 2169 2308 2520 2709 14 1342 1452 1623 1779 
20 2339 2482 2700 2894 15 1491 1607 1786 1949 
21 2511 2659 2882 3081 16 1645 1765 1952 2121 
22 2686 2838 3067 3270 17 1800 1926 2120 2295 
23 2865 3020 3254 3461 18 1959 2090 2291 2472 
24 3046 3204 3444 3654 19 2122 2257 2465 2651 
25 3229 3391 3635 3848 20 2286 2427 2641 2831 
26 3416 3581 3828 4044 21 2456 2599 2819 3014 
27 3606 3773 4024 4242 22 2626 2775 2999 3199 
28 3798 3968 4222 4442 23 2799 2953 3182 3385 
29 3993 4166 4422 4644 24 2977 3132 3367 3573 
30 4191 4366 4624 4848 25 3156 3314 3554 3763 
31 4393 4569 4829 5054 26 3338 3499 3742 3955 
32 4597 4775 5037 5262 27 3522 3687 3933 4149 
33 4805 4983 5247 5473 28 3709 3877 4126 4344 
34 5015 5195 5459 5685 29 3899 4069 4321 4541 
35 5229 5410 5674 5899 30 4092 4263 4518 4739 
36 5448 5628 5892 6117 31 4288 4461 4718 4939 
37 5670 5850 6113 6337 32 4486 4661 4920 5142 
38 5896 6076 6338 6559 33 4688 4864 5124 5347 
39 6127 6306 6566 6785 34 4893 5070 5331 5554 
40 6363 6541 6798 7014 35 5100 5278 5540 5763 
41 6605 6781 7034 7247 36 5311 5489 5752 5974 
42 6853 7027 7276 7484 37 5526 5705 5967 6188 
43 7108 7279 7523 7726 38 5745 5923 6183 6404 
44 7372 7539 777 7974 39 5967 6145 6404 6622 
45 7647 7810 8040 8229 40 6195 6373 6629 6844 
46 7935 8092 8313 8493 41 6428 6603 6856 7069 
47 8242 8393 8602 8770 42 6665 6839 7089 7298 
48 8580 8721 8915 9068 43 6909 7081 7326 7531 
49 8975 9103 9275 9407 44 7160 7329 7569 7769 
45 7420 7585 7819 8012 
1 1.002 2.010 5.062 10.25 46 7690 7850 8077 8262 
2 20.89 29.97 48.82 71.54 47 7973 8128 8345 8522 
3 68.73 88.61 125.5 165.5 48 8275 8423 8629 8794 
4 137.7 168.4 222.3 277.9 49 8606 8745 8935 9086 
5 222.2 263.1 332.7 402.4 50 8995 9120 9289 9418 

















BIBLIOGRAPHY OF NONPARAMETRIC STATISTICS 
AND RELATED TOPICS 


I. RICHARD SAVAGE
National Bureau of Standards 


This bibliography contains 999 references on nonparametric 
statistics and related topics, classified as follows: (A) Surveys 
and Discussions (39), (B) Theory (31), (C) Tchebycheff In- 
equalities (94), (D) Tolerance Sets (21), (E) Goodness of 
Fit (122), (F) Multisample Problems (53), (G) Parameter 
Problems (135), (H) Contingency Tables (75), (I) Randomness 
(109), (J) Correlation and Curve Fitting (96), (K) Compara- 
tive Studies (49), (L) Systematic Statistics (127), (M) Scaling 
(37), (N) Distribution Theory (383), (O) Applications (89), 
(P) Tables (228), (X) Miscellaneous (28). 


INTRODUCTION 


Nonparametric statistics has recently become an important special
field of statistics. Papers related to nonparametric problems were
published in the nineteenth century, but the true beginning of the sub- 
ject may be taken as 1936, the year in which Hotelling and Pabst pub- 
lished their paper on rank correlation. By 1943 the literature had be- 
come extensive enough to warrant a review article by Scheffé. Now the 
number of papers concerned with nonparametric statistics appearing 
in statistical journals is very large; in fact these articles are taking a 
considerable portion of the available space. Consequently supplements 
to this bibliography will be issued in order to keep it up to date. 

In spite of the abundance of nonparametric literature, there is no 
generally accepted definition of the field. It is not always clear which 
techniques, problems, and theories are nonparametric. In the prepara- 
tion of this bibliography over-inclusion has been deemed better than 
omission of titles that might be of use to those interested in border-line 
aspects of nonparametric statistics. 

Entries in the bibliography are arranged alphabetically by author, 
and chronologically within authors. After each entry one or several let- 
ters appear, indicating the categories in the following list to which the 
entry belongs: 


A. Surveys and Discussions
B. Theory
C. Tchebycheff Inequalities
D. Tolerance Sets
E. Goodness of Fit
F. Multisample Problems
G. Parameter Problems
H. Contingency Tables
I. Randomness
J. Correlation and Curve Fitting
K. Comparative Studies
L. Systematic Statistics
M. Scaling
N. Distribution Theory
O. Applications
P. Tables
X. Miscellaneous


Following many of the entries appears a sequence of digits of the form
xy-abc; this means that the entry was reviewed in Mathematical Re-
views in the year 19xy on page abc.
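
The review-code convention can be illustrated with a short Python sketch. The function name `decode_review_code` and the sample code `50-123` are hypothetical and are not taken from any entry in the bibliography; the sketch assumes only the xy-abc form described above.

```python
def decode_review_code(code):
    """Split a code of the form 'xy-abc' into (year, page).

    Assumption: the two digits before the hyphen are the last two digits
    of a year in the 1900s, and the digits after it are a page number.
    """
    year_part, page_part = code.strip().split("-")
    if len(year_part) != 2 or not year_part.isdigit() or not page_part.isdigit():
        raise ValueError("not of the form xy-abc: %r" % (code,))
    return 1900 + int(year_part), int(page_part)


# Hypothetical code, for illustration only.
print(decode_review_code("50-123"))
```

A malformed code raises `ValueError` rather than returning a guess.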


A. Surveys and Discussions 


Since nonparametric statistics has existed as a special field only about 
fifteen years, surveys and discussions of the general subject are scarce. 
Two important papers of this type are by Scheffé [1943b] and Wolfo- 
witz [1949]. These papers give a comprehensive view of the problems 
and results obtained up to their time of publication. Pitman [1948] and 
Hemelrijk et al. [1951] have sets of lecture notes devoted to nonpara- 
metric statistics. Wilks [1948] covered many of the problems of non- 
parametric statistics in his discussion of order statistics. Wallis [1952] 
gave a brief introduction to the subject and its applications. Most of 
the other papers given this classification are specialized and their cross 
classifications will better indicate their content. 


B. Theory 


There does not exist a unified theory of nonparametric statistics, 
but there have appeared theoretical approaches to some of the special- 
ized problems. The structure of critical regions with optimum proper- 
ties was discussed by Feller [1938], Scheffé [1943a], Lehmann and Stein 
[1949], Hoeffding [1951b], and Lehmann [1951]. The use of “maximum 
likelihood” in the nonparametric theory was introduced by Wolfowitz 
[1942]; Levene [1952] made further use of this concept. Hodges and 
Lehmann [1950] gave some nonparametric estimators which are opti-
mum in terms of minimax theory.


C. Tchebycheff Inequalities


Tchebycheff inequalities are included in the bibliography since they 
allow one to make probability statements when only a small amount 
of a priori information (usually several moments) is given about the 
distributions involved. Fréchet [1950] presented a review which included 
many of the inequalities of the Tchebycheff type. Godwin [1944] gave 
an English summary of the Fréchet material. Guttman [1948b] and 
Midzuno [1950] introduced inequalities using higher sample moments, 
which gave shorter average confidence intervals than the usual in- 
equalities. 
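
As a reminder of the simplest inequality of this type, the classical Tchebycheff bound states that a variable with mean mu and standard deviation sigma satisfies P(|X - mu| >= k sigma) <= 1/k^2. The Python sketch below compares that bound with the observed tail fraction of a small data set. The function names and the data values are illustrative assumptions and do not come from any of the cited papers.

```python
from statistics import mean, pstdev


def tchebycheff_bound(k):
    """Upper bound 1/k^2 on P(|X - mu| >= k*sigma), capped at 1."""
    if k <= 0:
        raise ValueError("k must be positive")
    return min(1.0, 1.0 / (k * k))


def tail_fraction(data, k):
    """Fraction of values lying at least k population standard
    deviations away from the mean of the data."""
    mu = mean(data)
    sigma = pstdev(data)
    return sum(abs(x - mu) >= k * sigma for x in data) / len(data)


data = [0, 0, 0, 0, 10]  # illustrative values only
k = 1.5
print(tail_fraction(data, k), "<=", tchebycheff_bound(k))
```

The bound uses only the first two moments, which is why it holds whatever the shape of the distribution.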


D. Tolerance Sets 


Wilks [1941] gave the first presentation of nonparametric tolerance 
limits. There have been subsequent generalizations of the theory of 
tolerance limits treating multivariate samples, irregularly shaped re- 
gions, discontinuous cases, and sequential cases. A recent paper in this 
field is by Fraser [1953a]. 
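
A minimal sketch of the basic calculation follows, assuming a continuous univariate population and taking the sample minimum and maximum as the tolerance limits. Under that assumption the proportion of the population covered by the two extremes of a sample of n has a Beta(n - 1, 2) distribution, so the probability that the coverage is at least beta is 1 - n beta^(n-1) + (n - 1) beta^n. The function names are hypothetical.

```python
def coverage_confidence(n, beta):
    """P(coverage of [sample min, sample max] >= beta) for a sample of
    size n from a continuous distribution (coverage ~ Beta(n - 1, 2))."""
    if n < 2 or not 0.0 < beta < 1.0:
        raise ValueError("need n >= 2 and 0 < beta < 1")
    return 1.0 - n * beta ** (n - 1) + (n - 1) * beta ** n


def smallest_sample_size(beta, gamma):
    """Smallest n for which the extremes cover at least a proportion
    beta of the population with probability at least gamma."""
    n = 2
    while coverage_confidence(n, beta) < gamma:
        n += 1
    return n


print(smallest_sample_size(0.95, 0.95))
```

The result does not depend on the form of the population, which is the sense in which such limits are distribution-free.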


E. Goodness of Fit 


A goodness of fit test has the following properties: (1) It is defined 
for samples from some large class of distributions, such as all continuous 
univariate distributions. (2) The null hypothesis is either some specified 
distribution or a class of distributions of which the functional form is 
known. (3) For all null hypotheses the test statistic used has the same 
distribution (at least in the limit). (4) The test is consistent. 

The first goodness of fit test, chi-square, was introduced by K. Pear-
son [1900]. Since then many new procedures have been presented. Cur-
rently, there is much interest in the Kolmogorov-Smirnov tests and
related topics. These have been summarized by Anderson and Darling 
[1952]. Some progress has been made in devising goodness of fit tests 
for multivariate problems; see papers by P. B. Simpson [1951] and 
Rosenblatt [1952a]. There has been little justification for the proposed 
goodness of fit procedures, but Neyman [1937], Mann and Wald [1942], 
and Wolfowitz [1942] gave procedures having optimum properties 
other than consistency. 
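
To make property (3) concrete, the following Python sketch computes the one-sample Kolmogorov-Smirnov statistic, the largest vertical distance between the empirical distribution function and a fully specified continuous null distribution. The function name, the sample values, and the choice of a uniform null hypothesis are illustrative assumptions; critical values must be taken from published tables and are not computed here.

```python
def ks_statistic(sample, cdf):
    """Largest absolute difference between the empirical distribution
    function of `sample` and the hypothesized continuous `cdf`."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical distribution function jumps from i/n to (i+1)/n
        # at x, so both sides of the jump are compared with cdf(x).
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def uniform_cdf(x):
    """Distribution function of the uniform distribution on (0, 1)."""
    return min(1.0, max(0.0, x))


print(ks_statistic([0.1, 0.5, 0.9], uniform_cdf))
```

Because the statistic depends on the data only through the values cdf(x), its null distribution is the same for every continuous hypothesized distribution.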


F. Multisample Problems 


Multisample problems or multisample goodness of fit problems in- 
volve testing the hypothesis that several samples come from the same 
population. Solutions to these problems should satisfy the following
conditions: (1) Under the null hypothesis the test statistic is distribu-
tion-free (at least in the limit). (2) The procedure is consistent for a
large class of alternatives. 

An early approach to this problem was K. Pearson’s [1911] two sam- 
ple chi-square test. Since then, many procedures have been introduced. 
Recently, there have been many investigations of tests related to the 
Kolmogorov-Smirnov test; an example is the work of Anderson and 
Darling [1952]. Tests having optimum properties other than consist- 
ency were suggested by Wolfowitz [1942] and Lehmann [1951]. 


G. Parameter Problems 


Parameter problems include estimation and testing procedures deal- 
ing with location and scale parameters. Although these problems in- 
volve parameters, they are nonparametric since (1) the parameters are 
defined for large classes of distributions, and (2) the proposed proce- 
dures lead to probability statements that are distribution-free. 

A typical parameter problem is the testing of the hypothesis that a 
sample comes from a distribution with some specified percentage point. 
This problem has received extensive treatment by K. R. Nair [1940b], 
Stewart [1941], Dixon and Mood [1946], Noether [1948, 1951] and
Walsh [1949c, 1951a]. Testing the hypothesis that two samples come 
from populations with the same median has been examined by many 
authors beginning with the work of Wilcoxon [1945] and summarized 
by Kruskal and Wallis [1952]. Nonparametric analysis of variance has 
been discussed by Pitman [1937c], Friedman [1937], G. W. Brown and 
Mood [1951], and Terry and Bradley [1952c]. 
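
The percentage-point problem can be sketched in a few lines of Python. If the hypothesized value q0 is the 100p per cent point of a continuous distribution, the number of observations falling below q0 is binomial with parameters n and p, whatever the form of the distribution. The function names and the data are hypothetical, and observations exactly equal to q0 are assumed not to occur.

```python
from math import comb


def binomial_tails(k, n, p):
    """Return (P(K <= k), P(K >= k)) for K ~ Binomial(n, p)."""
    pmf = [comb(n, j) * p ** j * (1 - p) ** (n - j) for j in range(n + 1)]
    return sum(pmf[: k + 1]), sum(pmf[k:])


def percentage_point_test(sample, q0, p):
    """Count the observations below q0 and return the count together
    with its lower and upper binomial tail probabilities under the
    hypothesis that q0 is the 100p per cent point."""
    k = sum(x < q0 for x in sample)
    return k, binomial_tails(k, len(sample), p)


sample = list(range(1, 11))  # illustrative data: 1, 2, ..., 10
print(percentage_point_test(sample, 8.5, 0.5))
```

With p = 1/2 this reduces to the sign test for a hypothesized median.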

Optimum procedures have been investigated by Wolfowitz [1942], 
Lehmann and Stein [1949], Hodges and Lehmann [1950], and Hoeffding 
[1951b]. 


H. Contingency Tables 


Contingency tables are the conventional r × s tables used to cross-
classify data. Techniques using contingency tables include the analysis
of association and tests of goodness of fit. Many of these techniques are 
distribution-free, since they are based on conditional distributions of 
the sample. 
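
The role of the conditional distribution is easiest to see in the 2 × 2 case: when both sets of marginal totals are held fixed, the count in one cell has a hypergeometric distribution under the hypothesis of no association, and this distribution involves no unknown parameters. The Python sketch below, with hypothetical function names and an illustrative table, computes the probability of a table and an exact one-sided tail.

```python
from math import comb


def table_probability(a, b, c, d):
    """Conditional probability of the 2 x 2 table [[a, b], [c, d]]
    given its row and column totals, under no association."""
    return comb(a + b, a) * comb(c + d, c) / comb(a + b + c + d, a + c)


def upper_tail(a, b, c, d):
    """Probability of the observed table or one with a larger count in
    the first cell, with all marginal totals held fixed."""
    total = 0.0
    while b >= 0 and c >= 0:
        total += table_probability(a, b, c, d)
        # Move one unit into the first cell while preserving the margins.
        a, b, c, d = a + 1, b - 1, c - 1, d + 1
    return total


print(upper_tail(3, 1, 1, 3))  # illustrative table only
```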

An interesting theoretical treatment of contingency tables was 
made by Fisher [1948]. Papers by E. S. Pearson [1947] and Barnard 
[1947a, 1947b] constitute a survey of the field. Mainland [1948] gave 
extensive applications and tables.


I. Randomness


In many situations it is desirable to examine the assumption of 
randomness. This involves testing the hypothesis that a sequence of 
observations was made on independently and identically distributed 
random variables. Wald and Wolfowitz [1943] and Levene [1952] pre- 
sented many procedures of this type. 
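
One simple procedure of this kind counts runs above and below the sample median. The Python sketch below is offered only as an illustration with hypothetical function names and data. It uses the standard result that, with n1 observations above and n2 below, the expected number of runs under randomness is 1 + 2 n1 n2 / (n1 + n2); observations equal to the median are discarded.

```python
from statistics import median


def count_runs(labels):
    """Number of maximal blocks of identical consecutive labels."""
    runs = 0
    previous = object()  # sentinel that equals no label
    for label in labels:
        if label != previous:
            runs += 1
            previous = label
    return runs


def expected_runs(n1, n2):
    """Expected number of runs for n1 and n2 labels of two kinds."""
    return 1 + 2 * n1 * n2 / (n1 + n2)


def runs_about_median(observations):
    """Return (observed runs, expected runs) about the sample median."""
    m = median(observations)
    labels = [x > m for x in observations if x != m]
    n_above = sum(labels)
    return count_runs(labels), expected_runs(n_above, len(labels) - n_above)


print(runs_about_median([1, 2, 3, 4, 5, 6, 7, 8]))  # a steady trend
```

A steadily increasing sequence gives far fewer runs than expected, which is the kind of departure such a test detects.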


J. Correlation and Curve Fitting 


Problems of correlation and curve fitting are properly classified as 
parameter problems (G) but the wealth of literature on this subject 
justifies a separate class. A fundamental paper on nonparametric 
correlation is the treatment of Spearman’s coefficient by Hotelling 
and Pabst [1936]. A small treatise on many rank correlation methods 
was prepared by Kendall [1948a]. Typical papers on curve fitting are 
by K. R. Nair and Shrivastava [1942] and K. R. Nair and Banerjee
[1942]. 
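
For readers who want the computation behind Spearman's coefficient, the sketch below uses the familiar formula 1 - 6 Σ d² / (n(n² - 1)), where d is the difference between the ranks of paired observations. It assumes there are no tied values; the function names and data are illustrative and are not taken from the papers cited.

```python
def ranks(values):
    """Ranks 1..n of the values, assuming no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(x, y):
    """Spearman's rank correlation coefficient for untied paired data."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))


print(spearman([1, 2, 3, 4, 5], [2, 1, 4, 3, 5]))  # illustrative data
```

Since only ranks enter the formula, the coefficient is unchanged by any increasing transformation of either variable.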


K. Comparative Studies 


The work of F. N. David and Johnson [1951a, 1951b] on the distri-
bution of the F statistic under non-normal conditions is typical of
the material given this classification. Emphasis is placed on finding the 
operating characteristics of “normal” statistics under non-normal con- 
ditions rather than on the development of distribution-free statistics. 
These papers are nonparametric, since they show which of the para- 
metric procedures have operating characteristics which are not strongly 
dependent on the specific parametric assumptions that were used in 
their development. 


L. Systematic Statistics 


Mosteller [1946] introduced the term “systematic statistics” when 
referring to linear functions of the order statistics of a sample. Much 
of the theory of these statistics has involved the assumption of nor- 
mality. Nevertheless, these techniques have two things in common with 
nonparametric techniques: ease of computation, and “inefficiency.” 
Dixon and Massey [1951] summarized many of the uses of systematic 
statistics. 


M. Scaling 


Many statistical problems involve the measurement or the com- 
parison of objects where the units of measurement are not well defined, 
for instance in measuring tastes. Hence either artificial scales are de-
veloped or scales are avoided by using ranks. The reports by Bradley
and Duncan [1950], and Bradley and Terry [1951a, 1951b] contain
much information on this subject. 


N. Distribution Theory 


The development of nonparametric procedures involves many dis- 
tribution problems. Mood [1940] found many of the distributions that 
are connected with run theory. Wald and Wolfowitz [1944] developed 
limit theorems needed in the theory of statistics based on the method 
of randomization. Hoeffding and Robbins [1948] gave special central 
limit theorems useful in developing tests of randomness. 


O. Applications


Since nonparametric statistics has developed recently there have 
been few published applications. However, most theoretical papers 
give illustrations of the methods being presented. 


P. Tables 


Once a nonparametric technique has been developed it is often useful 
to have special tables to facilitate applications. An example is the 
tables by Swed and Eisenhart [1943], giving the distribution of runs. 


X. Miscellaneous 


In spite of the many classifications given, a few papers remain un- 
classified. Papers by Wallis [1942] on the combination of independent 
tests and Tukey [1946] on inequalities for deviations from the median 
are typical of these miscellaneous papers.


Cramér (1928b); F. N. David (1934, 1939, 1947a, 1948, 1950a, 1950b); F. N.
David and Johnson (1948); Donsker (1952); Doob (1949); DuBois (1935); 
Edwards (1950); Eisenhart (1935, 1937, 1938); Eisenhart and Wilson (1943); 
Elderton (1901, 1927); Feller (1948); Feuell and Rybicka (1951); Fisher (1923, 
1924, 1928, 1950); Fiske and Dunlap (1945); Fraser (1950); Fry (1938); Geary 
(1947); Glivenko (1933); Grüneberg and Haldane (1937); Gumbel (1942, 1948);
Hald and Sinkbaek (1950); Haldane (1937); Hannan (1950); Hemelrijk (1950a, 
1950b, 1950c, 1950d); Hoel (1938); Kac (1949); Kimball (1947); Kolmogorov
(1933b, 1941); Lancaster (1950); P. C. V. Lesser (1933); Lewis and Burke (1949,
1950); Malmquist (1950); Maniya (1949); Mann and Wald (1942); Massey
(1950a, 1950b, 1951a, 1952c); von Mises (1931b, 1947); Moroney (1951); K. R. 
Nair (1937); Neyman (1935, 1937, 1940, 1949); Neyman and Pearson (1928, 
1930); Okamoto (1952); Olmstead (1940); Pastore (1950); Patnaik (1949); E. S.
Pearson (1938a); K. Pearson (1900, 1922, 1924, 1932a, 1933a, 1933b, 1934);
Peters (1950); Poti (1950); Robinson (1933); Rosenblatt (1952a, 1952b); Sawkins
(1941); Seal (1947, 1948); Shanawany (1936); Sherman (1950); P. B. Simpson 
(1951); Smirnov (1936, 1937, 1948); P. V. Sukhatme (1938); Tippett (1952); 
Wald and Wolfowitz (1939, 1941); Vora (1951); A. M. Walker (1950, 1952); 
Waschakidse (1938); Williams (1950); Wilson (1952); Wold (1938); Wolfowitz 
(1942); Yule (1922); Yule and Kendall (1950). 


F. Multisample Problems (54) 


T. W. Anderson and Darling (1952); Bhattacharyya (1943, 1946); Bowker 
(1944); van Dantzig (1951a); Dixon (1940); Doob (1949); Drion (1952); Feller
(1948); R. S. Gardner (1950); Gihman (1952); Gnedenko (1952); Gnedenko 
and Korolyuk (1951); Gnedenko and Mihalevič (1952a, 1952b); Gnedenko and
Rvačeva (1952); Hemelrijk (1950a, 1950b, 1950c, 1950d, 1951); Hemelrijk, et
al. (1951); Kolmogorov (1941); Krishna Iyer (1951a, 1952b); Kunisawa, Ma-
kabe and Morimura (1951); Kvit (1950); Lal (1952); Lehmann (1951); Marshall
(1951); Massey (1951a, 1951b, 1951c, 1952b); Mathen (1946); Mathisen (1943); 
Mihalevič (1952); Neyman and Pearson (1928, 1930); Nybolle (1936); K. Pear-
son (1911, 1932b); Petrov (1951); Rhodes (1924); Smirnov (1939a, 1939b, 1948);
Tippett (1952); Tsao (1952); van der Vaart (1950a); Wald and Wolfowitz 
(1940, 1941); Wolfowitz (1942, 1949). 


G. Parameter Problems (136)


Armitage (1944); Barnard (1943a); Blomqvist (1951); Bradley (1952b); 
Bradley and Terry (1951b, 1952a, 1952b, 1952c); G. W. Brown and Mood
(1951); Cochran (1937b, 1943, 1950); L. C. Cole (1945); F. N. David and John-
son (1951a, 1951b, 1951c, 1952); H. A. David (1951); Dixon (1952); Dixon and
Mood (1946); Drion (1951); Durbin (1951); Dwass (1952); Eddison, et al. (1951);
Ehrenberg (1951); Eisenhart (1947); Epstein (1953); Evans (1942); Festinger 
(1943, 1946); Fisher (1948, 1949); Fix and Hodges (1951, 1952); Fraser (1952); 
Freeman and Halton (1951); Friedman (1937); Gayen (1950a, 1950b); Guttman 
(1948b); Halmos (1946); Hartley (1950a, 1950b); Hayashi (1950); Hemelrijk 
(1950b, 1950c, 1950d, 1952b, 1952c); Hemelrijk, et al. (1951); Hodges and Leh- 
mann (1950); Jeeves and Richards (1950); Kemperman (1950); Kempthorne 
(1952); Kruskal (1952); Kruskal and Wallis (1952); Lehmann (1953); Lloyd 
(1952); Mann and Whitney (1947); Marshall (1951); Marshall and Walsh (1950);
Massey (1952a, 1952d); Mathematical Centre (1952); Mood (1950); Moroney 
(1951); Moses (1952b); Mosteller (1948); Mosteller and Tukey (1950); A. N. K. 
Nair (1941); K. R. Nair (1940a, 1940b, 1948a); Noether (1948, 1951); E. S. 
Pearson (1931, 1950b); Pillai (1951); Pitman (1937a, 1937c, 1939); Rijkoort 
(1952); Savur (1937, 1938); Schultz (1945); Snedecor (1953); Stevens (1948, 
1951); Stewart (1941); Stuart (1951); Swineford (1946); Terry (1952); Thomp- 
son (1936); Tippett (1952); Tukey (1949b, 1949c, 1950); Tweedie (1953); van 
der Vaart (1950b); van der Waerden (1952, 1953); Wallis (1939); Walsh (1946a, 
1946c, 1946d, 1947, 1949a, 1949b, 1949c, 1950a, 1950b, 1950c, 1951a, 1951b,
1952a, 1952b, 1953); Welch (1937, 1938); Westenberg (1948, 1950a, 1950b, 1952) ; 
White (1952); Whitney (1948, 1951); Wilcoxon (1945, 1946, 1947, 1949, 1950); 
Wilks (1940); Wilson (1952); Woodruff (1952); Yates (1951); Zubin (1939). 


H. Contingency Tables (76) 


Adler (1951); Allinson and Bates (1944); Barnard (1945, 1947a, 1947b); 
Bonnier (1942); Carroll and Bennett (1950) ; Cochran (1952); C. C. Craig (1953); 
Crow (1952); Edwards (1950); Eisenhart (1935); Federighi (1950); Finney 
(1948); Fisher (1922, 1926a, 1935, 1941, 1945); Freeman and Halton (1951); 
Fulcher (1942); Gebelein (1942); Geppert (1944); Gildemeister and Waerden 
(1943); Goodman and Kruskal (1953); Haldane (1939, 1940); Irwin (1935, 
1949); Irwin and Snedecor (1933); Jeffreys (1937); Jurgensen (1947); Kendall 
(1948b); Kermack and McKendrick (1940); Kondo (1939); Lancaster (1949, 
1951); Leslie (1951); Lewis and Burke (1949); Lombard and Doering (1947); 
Mainland (1948, 1952); Mainland and Murray (1952); Mood (1949); Moroney 
(1951); Neyman and Pearson (1928); Patnaik (1948); E. S. Pearson (1947); 
E. S. Pearson and Merrington (1948); K. Pearson (1916a); K. Pearson and
Heron (1913); Pompilj (1952); Rao (1952); Sillitto (1949); B. H. Simpson 
(1951); Skory (1952); C. A. B. Smith (1951); C. D. Smith (1951); Snedecor and 
Irwin (1933); Stevens (1938, 1951); Swaroop (1938); Swineford (1948); Tocher 
(1950) ; Wilks (1935); Wilson (1941, 1942); Wilson and Worcester (1942a, 1942b) ; 
Yates (1934, 1948); Youden (1950); Yule (1912, 1922); Yule and Kendall 
(1950). 


I. Randomness (109)


Armitage, Baines and Lindley (1944); Bartlett (1951, 1952); Bateman (1948); 
B. M. Bennett (1952); C. A. Bennett (1951); J. Bertrand (1875); Besson (1920); 
Bienaymé (1875); Bilham (1926); Borel (1933); Bose (1946, 1950); B. Brown 
(1948); Cameron (1952); Campbell (1942); Cochran (1936, 1938); Cowles and
Jones (1937); Dantzig (1939); F. N. David (1947b); H. T. Davis (1941); Dodd 
(1939, 1941, 1942); Eisenhart and Wilson (1943); Elfving and Whitlock (1950); 
Finney (1942, 1947); Fisher (1926b); Freund (1951); Gage (1943); Ghosh (1948); 
Gleissberg (1945a, 1945b); Gold (1929); Good (1953); Goodman (1952); Grant 
(1952); J. A. Greenwood (1946); Hald (1952); Haldane and Smith (1948); 
Housner and Brennan (1948); N. L. Johnson (1948); H. E. Jones (1937); 
Juncosa (1949); Keeping (1952) ; Kendall and Smith (1938, 1939a, 1939b); 
Kermack and McKendrick (1937a, 1937b); Krishna Iyer (1948b, 1952a); Kuz- 
nets (1929); Levene (1946a, 1946b, 1952); Levene and Wolfowitz (1944); Lowry 
(1951); Mahalanobis (1944); Mann (1945a, 1945b, 1950); Marbe (1926, 1934); 
Mihoc (1943); G. H. Moore and Wallis (1943); P. G. Moore (1949); Moran 
(1947a, 1948c, 1951a); Mosteller (1941); K. R. Nair (1938); Noether (1950);
Olds (1949); Olmstead (1942, 1946); van der Plank (1947); Rosander (1942); 
von Schelling (1939); Shewhart (1941); Silberstein (1945); Singh (1952); 
Stevens (1937); Stuart (1952); B. V. Sukhatme (1949a, 1949b); Terpstra 
(1952a, 1952b); M. Thomas (1951); Tintner (1952); Tippett (1927); Todd 
(1940); Ville (1943a, 1943b); Wald and Wolfowitz (1943); Waldapfel (1943); 
A. M. Walker (1950, 1952); H. M. Walker (1929); Wallis and Moore (1941a, 
1941b); Wilson (1952); Wold (1948); Wolfowitz (1942); Young (1941); Yule 
(1938); Yule and Kendall (1950). 


J. Correlation and Curve Fitting (96)


Bartlett (1949); Blomqvist (1950); A. Brown and Stanley (n.d.); Chown and 
Moran (1951); Cochran (1937b); Cramér (1924); Crist (1940); Cronbach and 
Glosser (1952); Daniels (1943, 1950, 1951); Daniels and Kendall (1947); F. N.
David (1950c); S. T. David, Kendall and Stuart (1951); Deming (1938); Drion
(1951); DuBois (1939); Durbin and Stuart (1951); Eells (1929); Esscher (1924); 
Gayen (1951); Geary (1953); Guilford (1941); Hemelrijk (1949a, 1949b, 1950e); 
Hemelrijk, et al. (1951); Hoeffding (1940, 1941, 1942, 1947, 1948b, 1952b); 
Horn (1942); Hotelling (1940); Hotelling and Pabst (1936); Judd (1936); 
Kaarsemaker and Wijngaarden (1952); Kendall (1938, 1942b, 1945, 1947, 
1948a, 1948b, 1949); Kendall, Kendall and Smith (1938); Lehmann (1953); 
Lindeberg (1925, 1929); Lyerly (1952); McNemar (1947); Mood (1950); Moran 
(1948a, 1948b, 1950a, 1950b, 1951b); Moroney (1951); K. R. Nair and Banerjee 
(1942); K. R. Nair and Shrivastava (1942); Neyman (1951); Neyman and Scott 
(1951, 1952); Nimeroff (1952); Norton (1946); Olds (1938a); Olmstead and 
Tukey (1947); K. Pearson (1916b); Pitman (1937b); Pridmore (1944); Quenou-
ille (1952); Schultz (1945); Schützenberger (1948a); Scott (1950, 1951); Sillitto
(1947); Silverstone (1950); B. B. Smith (1941); Spearman (1904); Spurr (1951); 
Steffensen (1941); “Student” (1921); Theil (1950a, 1950b, 1951); Thornton 
(1943); Tippett (1952); Treloar (1942); Wald (1940); Weichelt (1946); Whit- 
field (1947, 1949, 1950); Wolfowitz (1943); Woodbury (1940); Yule and Kendall 
(1950). 


K. Comparative Studies (44) 


Baker (1946); Bartlett (1935); Benson (1949); Bradley (1952a, 1952b); Camp 
(1946); Chung (1946); Cochran (1936); F. N. David (1949); F. N. David and 
Johnson (1951a, 1951b, 1951c, 1952); Davies and Pearson (1934); Eden and
Yates (1933); Eisenhart (1947); Festinger (1943); Finch (1950); Gayen (1949, 
1950b); Geary (1936, 1947); Godwin (1949a); Hastings, et al. (1947); Hirsch-
mann (1943); Hotelling (1947); H. L. Jones (1953); Laderman (1939); Moriguti
(1951); A. N. K. Nair (1941); K. R. Nair (1950); E. S. Pearson (1931, 1937, 
1938b, 1950b); E. S. Pearson and Adyanthiaya (1929); E. S. Pearson and Mer- 
rington (1951); Perlo (1933); Plackett (1947); Rider (1929); Shone (1949); 
Steinhause (1948); Tukey (1948b, 1950). 


L. Systematic Statistics (127) 


Baker (1946); Banerjee (1952); Barnard (1943a); Belz and Hooke (1953) ; Ben- 
derskii (1952); C. A. Bennett (1952, 1953); Benson (1949); Bhate (1951); Burr 
(1952); Cadwell (1952); Chandler (1952); Chandra Sekar and Francis (1941); 
Chaplin (1880, 1882); Cochran (1941); R. H. Cole (1949, 1951); Cox (1948); 
A. T. Craig (1932); Cramér (1946); Daly (1946); Darling (1952a, 1952b); H. A. 
David (1951); Davies and Pearson (1934); Dodd (1923); DuBois (1935); Dwass 
(1952); Egudin (1947); Eisenhart, Deming and Martin (1948); Epstein (1948, 
1949, 1951, 1952); Epstein and Sobel (1952a, 1952b); Feller (1951); Fisher and 
Tippett (1928); Fisher and Yates (1948); Fréchet (1927); Gartstein (1948);
Gnedenko (1943); Godwin (1948, 1949b); Griffith (1920); Gumbel (1935, 1944, 
1946, 1947, 1949); Gumbel and Keeney (1950a, 1950b) ; Gumbel and Greenwood 
(1951); Hald (1952); Harris (1952); Hartley (1942, 1950a, 1950b); Hartley and 
Pearson (1951); Hastings, et al. (1947); Hayashi (1950); Hoeffding (1953); Hojo
(1932, 1933); Homma (1951); Irick (1952); A. E. Jones (1946); H. L. Jones 
(1948); Kawata (1951); Keen, Page and Hartley (1953); Kendall (1940); Lord 
(1947); May (1952); McIntyre (1952); McKay (1935); McMillan (1949); 
Melzler (1949, 1950); von Mises (1936); Mood (1941); Moriguti (1951, 1952b); 
Moshman (1952); Mosteller (1946); K. R. Nair (1948a, 1948b, 1952); K. R. 
Nair and Shrivastava (1942); Ogawa (1951); Olds (1935); Patnaik (1950); 
E. S. Pearson (1932, 1937, 1938b, 1952); E. S. Pearson and Haines (1935);
E. S. Pearson and Hartley (1943); K. Pearson (1907, 1920); Peirce (1926); 
Pillai (1948, 1950, 1951, 1952); Pillai and Ramachandran (1953); Rider (1950, 
1951a, 1951b, 1953); Sandelius (1952); Schützenberger (1948b, 1948c); Shone
(1949); Sillitto (1951); Smirnov (1935, 1949); Terpstra (1952b); H. A. Thomas 
(1948); Thompson (1938); Tukey (1948c); Walsh (1946a, 1946b); Watson 
(1952); Weichelt (1946); Woodruff (1952); Yamanouchi (1949). 


M. Scaling (37) 


Baten (1946); Baten and Trout (1946); Bliss, Anderson and Marland (1943) ; 
Bradley (1953); Bradley and Duncan (1950); Bradley and Terry (1951a, 1951b); 
Chapman (1934, 1935); Cramér (1946); Crist and Seaton (1941); Durbin (1951); 
Eddison, et al. (1951); Ehrenberg (1952); Eysenck (1939); Friedman (1940); 
J. A. Greenwood (1943); M. L. Greenwood and Salerno (1949); Greville (1941) ; 
Grossnickle (1942); Guttman (1946); van der Heiden (1952); P. O. Johnson 
(1950) ; Judd (1936); Kendall (1942a, 1948a); Kendall and Smith (1939c, 1939d) ; 
Moran (1947b); Mosteller (1951a, 1951b, 1951c); Scheffé (1952); Schuyler
(1948); Stouffer, et al. (1950); Stuart (1951); Terry, Bradley and Davis (1952). 


N. Distribution Theory (383) 


Andersen (1949); T. W. Anderson (1943); T. W. Anderson and Darling (1952) ; 
André (1879, 1883a, 1883b); Arfwedson (1951); Armitage, Baines and Lindley 
(1944); Bachelier (1912); Banerjee (1952); Bateman (1948); Baticle (1951); Bat- 
tin (1942); Benderskii (1952); Belz and Hooke (1953); Bendersky (1948); 
Bergström (1949, 1951); Berry (1941); Bertrand (1875, 1907); Besson (1920);
Bickerstaff (1947); Bienaymé (1875); Bilham (1926); Birnbaum (1948, 1950); 
Birnbaum and Tingey (1951); Blomqvist (1950, 1951); Borel (1933); Bose 
(1946, 1950); Bottema and van Veen (1943, 1946); G. W. Brown and Mood 
(1951); Burr (1952); Cadwell (1952); Carlton (1946); Chandler (1952); Chandra 
Sekar and Francis (1941); Chapman (1934, 1935); Chung (1948, 1949); Chung 
and Feller (1949); Clark (1933); Cochran (1938, 1942, 1950); Cramér (1928a, 
1928b, 1946); Daniels (1951); Daniels and Kendall (1947); Dantzig (1939); 
Darling (1951, 1952a, 1952b); F. N. David (1934, 1938, 1939, 1947a, 1947b, 
1948); F. N. David and Johnson (1948, 1951a, 1951b, 1952); S. T. David,
Kendall and Stuart (1951); Dixon (1940); Dodd (1923); Domb (1947, 1952); 
Donsker (1952); Doob (1949, 1953); Drion (1952); Durbin and Stuart (1951);
Eggleton and Kermack (1944); Ehrenberg (1952); Eisenhart (1937, 1938); 
Eisenhart, Deming and Martin (1948); Epstein (1949); Erdös and Kac (1947);
Esscher (1924); Essen (1942); Evans (1942); Federighi (1950); Feller (1945, 
1948, 1950, 1951); Finney (1947); Fisher (1922, 1923, 1925, 1926a, 1926b, 1928) ; 
Fisher and Tippett (1928); Fisher and Yates (1948); Fraser (1950); Fréchet 
(1927); Freund (1951); Friedman (1937, 1940); A. Gardner (1952); R. S. Gardner 
(1950); Gartstein (1948); Gayen (1951); Geary (1936); Ghosh (1948); Gilde-
meister and Waerden (1943); Gleissberg (1945a, 1947); Gnedenko (1943, 1952);
Gnedenko and Korolyuk (1951); Gnedenko and Mihalevič (1952a, 1952b);
Gnedenko and Rvačeva (1952); Godwin (1948); Gold (1929); Gontcharoff
(1942, 1943, 1944); Good (1953); Goodman (1952); Goudsmit (1945); Grant 
(1952); J. A. Greenwood (1938, 1940, 1946); R. E. Greenwood (1953); Greville 
(1938, 1941, 1944); Gumbel (1935, 1944, 1946, 1947, 1949); Gumbel and Green- 
wood (1951); Gumbel and Keeney (1950a, 1950b); Gumbel and Schelling (1950); 
Gupta (1950); Haden (1947); Hadwiger (1943, 1946); Haldane (1937, 1939, 
1940); Haldane and Smith (1948); Hannan (1950); Harris (1952); Hartley (1942, 
1950a, 1950b); Hastings et al. (1947); Hayashi (1950); Hemelrijk (1952c); van 
der Heiden (1952); Hoeffding (1947, 1948a, 1948b, 1951a, 1953); Hoeffding and 
Robbins (1948); Hoel (1938); Hojo (1932); Hotelling and Pabst (1936); Hous- 
ner and Brennan (1948); Hsu (1945); Huntington (1937); Irick (1952); Ising 
(1925); N. L. Johnson (1948); Juncosa (1949); Kac (1949); Kanellos (1948); 
Kaplan (1949); Kaplansky (1945a); Kaplansky and Riordan (1945, 1946); Katz 
(1952); Kawata (1951); Keeping (1952); Kemperman (1950); Kendall (1938, 
1941, 1942a, 1942b, 1945, 1947, 1948a, 1949); Kendall, Kendall and Smith
(1938); Kendall and Smith (1939c, 1939d); Kermack and McKendrick (1937a,
1937b, 1938); Kimball (1947, 1950); Kolmogorov (1933b, 1941); Krishna Iyer 
(1947, 1948a, 1948b, 1949a, 1949b, 1950a, 1950b, 1950c, 1951b, 1952a, 1952b); 
Krishna Iyer and Sukhatme (1949); Kruskal (1952); Kullback (1939); Kuznets 
(1929); Levene (1946b, 1952); Levene and Wolfowitz (1944); Lévy (1936); 
Loève (1945, 1950); Lyerly (1952); Mack (1948); MacMahon (1915, 1916);
Madow (1948); Mahalanobis (1944); Malmqvist (1950); Maniya (1949); Mann 
(1945a, 1945b); Mann and Whitney (1947); Marbe (1926, 1934); Marshall 
(1951); Massey (1950b, 1951b, 1951c); Masuyama (1951); Mathisen (1943); 
Mauldon (1951); McCarthy (1947); McKay (1935); McMillan (1949); Melizler 
(1950); Mihalevič (1952); Mihoc (1943, 1949); von Mises (1936, 1947); Mood
(1940, 1941, 1950); G. H. Moore and Wallis (1943); P. G. Moore (1949); Moran
(1947a, 1947b, 1947c, 1948a, 1948b, 1948c, 1950b, 1951a, 1951b); Moriguti
(1951); Mosteller (1941, 1946, 1948); Mosteller and Tukey (1949, 1950); Mul- 
lemeister (1945); A. N. K. Nair (1942); K. R. Nair (1937, 1940a, 1940b, 1948a, 
1948b); Neyman (1935, 1937); Noether (1949, 1950); Olds (1935, 1938b, 1949, 
1952); Olmstead (1946); Olmstead and Tukey (1947); Patnaik (1949); E. S.
Pearson (1952); E. S. Pearson and Hartley (1943); K. Pearson (1900, 1911, 
1920, 1933b); Pillai (1952); Pitman (1937c); van der Plank (1947); Pollaczek 
(1952); Rajalakshma (1943); Rider (1950, 1951a, 1951b); Robbins (1944b,
1945); Rosenblatt (1952b); Sakamoto (1943); Sandelius (1952); Santaló (1947);
Sawkins (1947); von Schelling (1939); Schrutka (1941); Schützenberger (1948a,
1948b); Schuyler (1948); Seal (1948); Shanawany (1936); Sherman (1950); 
Shone (1949); Silberstein (1945); Sillitto (1947); Silverstone (1950); Singh 
(1952); Smirnov (1935, 1936, 1937, 1939a, 1939b, 1947, 1949); Stevens (1937, 
1938, 1951); Stewart (1941); Stuart (1950, 1951, 1952); “Student” (1921); 
B. V. Sukhatme (1949a, 1949b, 1951); Terpstra (1952a, 1952b); Thompson 
(1936, 1938); Tukey (1948c, 1949a, 1949c); Ville (1943a); Vora (1951); Votaw 
(1946) ; van der Waerden (1952); Wald (1947); Wald and Wolfowitz (1940, 1943, 
1944); Waldapfel (1943); H. M. Walker (1929); Wallis (1939); Wallis and Moore 
(1941a, 1941b); Walsh (1949a); Waschakidse (1938); Welch (1937, 1938); White 
(1952); Whitworth (1945); Wilks (1935, 1943); Wishart and Hirschfeld (1936);
Wolfowitz (1942, 1944a, 1944b); Woodbury (1940); Young (1941). 


O. Applications (89)


Azorin and Wold (1950); Baillie (1946); Baker and Guilbert (1942); Barnard 
(1943a, 1943b) ; Bates and Neyman (1951); C. A. Bennett (1951); Bilham (1926); 
B. Brown (1948); W. R. J. Brown (1952); Brownlee (1924); Cameron (1952) ; 
Campbell (1942); Chaplin (1880, 1882); Charley (1950, 1952); Clark (1933, 
1934); Cochran (1936, 1938); L. C. Cole (1945); Cowles and Jones (1937); 
Cramér (1928b); Crist (1940); Crist and Seaton (1941); van Dantzig (1952); 
D. G. Davis (1952); Dawson, Duehring and Parks (1947); Eisenhart (1935); 
Eisenhart and Wilson (1943); Elfving and Whitlock (1950); Epstein (1948); 
Esscher (1924); Eysenck (1939); Fiedler, Hartman and Rudin (1952); Gage 
(1943); Gold (1929); M. L. Greenwood and Salerno (1949); Griffith (1920); 
Grüneberg and Haldane (1937); Guttman (1946); Hald (1952); Haldane and
Smith (1948); Herdan (1949, 1950); H. E. Jones (1937); Judd (1936); Keeping 
(1952); Keen, Page and Hartley (1953); Kendall and Smith (1938, 1939b); 
Kermack and McKendrick (1937a); P. C. V. Lesser (1933); Lewis and Burke
(1949); Lollar (1952); Lombard and Doering (1947); Lowry (1951); Mainland 
(1948); Moses (1952a); Mosteller (1941); K. R. Nair (1938); Nimeroff (1952); 
Nybølle (1936); Olmstead (1940); K. Pearson (1900, 1911, 1932a); K. Pearson
and Heron (1913); Peirce (1926); Pridmore (1944); Reynolds (1952); Scheffé 
(1952); Schultz (1945); Schützenberger (1948b, 1948c); Sealy (1943); Shewhart
(1931, 1941); Stouffer, et al. (1950); Terry, Bradley and Davis (1952); H. A. 
Thomas (1948); Tintner (1952); Uhler (1952); White (1952); Whitfield (1950); 
Wilcoxon (1946); Yule (1922, 1938). 


P. Tables (228) 


O. N. Anderson (1935); T. W. Anderson and Darling (1952); Azorin and Wold 
(1950); Baker (1946); Bateman (1948); C. A. Bennett (1952); Berge (1932, 
1937); Bickerstaff (1947); Birnbaum (1952); Birnbaum and Tingey (1951); 
Blomqvist (1950, 1951); Cadwell (1952); Camp (1922, 1948); Chandler (1952) ; 
Chapman (1935); Clopper and Pearson (1934); L. C. Cole (1945); R. H. Cole 
(1949) ; Cox (1948); Daly (1946); F. N. David (1934, 1939, 1947a, 1947b, 1949, 
1950a); H. A. David (1951); S. T. David, Kendall and Stuart (1951); Dixon 
(1940); Dixon and Mood (1946); Dodd (1923); Drion (1952); DuBois (1939); 
Duncan (1952); Elderton (1901); Epstein (1952); Epstein and Sobel (1952a); 
Eysenck (1939); Festinger (1946); Finch (1950); Finney (1948); Fisher (1950); 
Fisher and Tippett (1928); Fisher and Yates (1948); Fix (1949); Fix and
Hodges (1952); Friedman (1937, 1940); Fulcher (1942); A. Gardner (1952); 
Gayen (1949, 1950a, 1950b, 1951); Geary (1947); Gleissberg (1945a); God- 
win (1949b); Gordon et al. (1952); Grant (1952); R. E. Greenwood (1953); 
Guilford (1941); Gumbel (1935, 1947, 1949); Gumbel and Greenwood (1951); 
Gumbel and Keeney (1950a, 1950b); Gumbel and Schelling (1950); Gupta 
(1950); Hald and Sinbaek (1950); Haldane and Smith (1948); Harris (1952); 
Hartley (1950a, 1950b); Hartley and Pearson (1951); Hastings, et al. (1947); 
Hemelrijk, et al. (1951); Hoeffding (1948b); Hojo (1933); Huntington (1937); 
A. E. Jones (1946); H. L. Jones (1948, 1953); Jurgensen (1947); Kaarsemaker 
and Wijngaarden (1952); Kaplansky (1945a); Katz (1952); Keeping (1952); 
Kendall (1938, 1948a); Kendall, Kendall and Smith (1938); Kendall and Smith
(1939a, 1939c, 1939d); Kolmogorov (1941); Krishna Iyer (1950a); Kruskal and 
Wallis (1952); Kunisawa, Makabe and Morimura (1951); C. E. V. Lesser (1942); 
Levene (1952); Lord (1947); Lyerly (1952); Mahalanobis, et al. (1933); Main-
land (1948, 1952); Mainland and Murray (1952); Mann (1950); Mann and
Whitney (1947); Massey (1950b, 1951a, 1951b, 1952b); Mathisen (1943);
Mathematical Centre (1952); May (1952); McCarthy (1947); McIntyre (1952); 
McKay (1935); G. H. Moore and Wallis (1943); G. H. Moran (1948b, 1950b, 
1951b); Moriguti (1951, 1952a, 1953); Moshman (1952); Moses (1952); Mostel- 
ler (1941, 1946, 1948); Mosteller and Tukey (1950); Murphy (1948); K. R. Nair 
(1940b, 1948a, 1948b, 1952); Neyman and Pearson (1930); Norton (1946); 
Ogawa (1951); Olds (1935, 1938a, 1949); Olmstead (1946); Olmstead and Tukey 
(1947); Patnaik (1948, 1949, 1950); E. S. Pearson (1932, 1947); E. S. Pearson 
and Hartley (1943); E. S. Pearson and Merrington (1948, 1951); K. Pearson 
(1900, 1919, 1920, 1931, 1933b, 1934); Pillai (1950, 1951, 1952); Pitman (1937c); 
Plackett (1947); Rider (1929, 1951b); Rijkoort (1952); Schmidt (1934); Schüt-
zenberger (1948c); Shanawany (1936); Sillitto (1947, 1949, 1951); Silverstone
(1950); Smirnov (1947, 1948); Stevens (1937); Stewart (1941); Stuart (1952); 
B. V. Sukhatme (1951); P. V. Sukhatme (1938); Swaroop (1938); Swed and 
Eisenhart (1943); Swineford (1946, 1948); Terry (1952); H. A. Thomas (1948); 
M. Thomas (1951); Thornton (1943); Tippett (1927); Tukey (1948c, 1949b, 
1949c); van der Vaart (1950b); van der Waerden (1952); Wald (1943); Wallis 
and Moore (1941a, 1941b); Walsh (1946a, 1946d, 1949a, 1949b, 1949c, 1950b,
1950c, 1951b); Welch (1937, 1938); Westenberg (1948, 1950a); White (1952); 
Whitfield (1949); Whitney (1951); van Wijngarden (1950); Wilcoxon (1945, 
1946, 1947, 1949); Wilks (1940, 1942); Williams (1950); Wilson (1952); Winsten 
(1946); Woodbury (1940); Young (1941); Zubin (1939). 


X. Miscellaneous (28) 


Bejar (1952); Birnbaum (1948); Birnbaum and Zuckerman (1944); Chakra- 
barti (1946, 1947); Dyson (1943); Fréchet (1947); Gordon, et al. (1952); Gutt- 
man (1948a); Herdan (1953); Hornich (1941); Kaplansky (1945b); Kerawala 
(1948); Masuyama (1951); von Mises (1939a); Moriguti (1952a, 1953); Mostel- 
ler and Tukey (1949); E. S. Pearson (1950a); Picard (1951); Plackett (1947); 
Reiersol (1944); Shohat (1929); Smirnov (1947); Tukey (1946); Wallis (1942); 
Wilkins (1944); Winsten (1946). 





ERRATA 


Readers and authors are invited to submit corrections to papers pub- 
lished in any previous issue. These will be published each year, in the 
December issue. 


Hyrenius, Hannes, ON THE USE OF RANGES, CROSS-RANGES AND EXTREMES
IN COMPARING SMALL SAMPLES, Vol. 48, No. 263 (September 1953),
534-45.

On page 536, in equation (9b), a factor $T^{-2}$ is missing.

Furthermore, the sum in the equation can be evaluated, simplifying 
the formula to 


$$ f(T) = \frac{(N_1 - 1)\,(N_1 - 1)!\,N_2!}{T^{2}\,(N_1 + N_2 - 1)!} \qquad \text{(9b)} $$


Kruskal, William H., and Wallis, W. Allen, USE OF RANKS IN ONE-
CRITERION VARIANCE ANALYSIS, Vol. 47, No. 260 (December 1952),
583-621.

1. In Section 5.3 of [a] we should have mentioned, had we known of
it, a 1952 article by van der Reyden [b]. Van der Reyden develops
Wilcoxon’s two-sample test independently, and tabulates critical values
of R at two-tail significance levels of 5, 2, and 1 per cent for all sample
sizes such that 10 ≤ N ≤ 30 and 2 or 3 ≤ n ≤ 12, the lower limit for n
being 2 at the 5 per cent level and 3 at the other levels.

Since van der Reyden’s tables for the 5 and 1 per cent levels cover
much the same ground as White’s [a, Sec. 5.3.5], we have compared the
two tables wherever possible and have corresponded with van der
Reyden, who in turn has corresponded with White, about discrepancies
between them. The upshot of this correspondence is: (i) There are
numerous and fairly sizeable errors in the columns for n=11 and 12 of
the three van der Reyden tables; van der Reyden has very kindly
sent us the corrected values, but these have not yet been published.²
(ii) In addition there are twelve scattered discrepancies, each of one
unit in R, between the van der Reyden and the White tables at the 5
and 1 per cent levels; in all of these van der Reyden appears to be
correct, White’s values leading to probabilities of Type I error slightly
greater than intended.

¹ For comments embodying or leading to these corrections and additions we are indebted to K. A.
Brownlee (University of Chicago), P. J. Rijkoort (Royal Netherlands Meteorological Institute),
L. J. Savage (University of Chicago), T. J. Terpstra (Mathematical Center, Amsterdam), D. van der
Reyden (Tobacco Research Board of Southern Rhodesia), and C. White (University of Birmingham).

² White, in a letter to us, points out that approximately correct values for columns 11 and 12 of
van der Reyden’s tables may be obtained as follows: move the entries in the L columns down one row
to find the approximately correct L values; then to obtain the corresponding U values use the relation-
ship U=n(m+1)—L (van der Reyden’s notation). This applies to all three levels of significance.


908 AMERICAN STATISTICAL ASSOCIATION JOURNAL, DECEMBER 1953


2. Reference [44] should have been listed as shown at the end of this 
note. Had this been available to us in time we should have included the 
following description in Section 5: 

Rijkoort’s C-Sample Test. Rijkoort [44] has proposed the C-sample
test which rejects when

$$ S = \sum_{i=1}^{C} n_i^{2} \left[ \bar{R}_i - \tfrac{1}{2}(N + 1) \right]^{2} $$

is large. The use of S is not equivalent to the use of H unless all $n_i$’s are
equal; in that case the relationship is $S = N^{2}(N+1)H/(12C)$. In
Rijkoort’s paper k is used for our C, and when all the $n_i$’s are equal m is
used to denote their common value.
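
The stated relationship between S and H for equal sample sizes can be checked numerically. The sketch below is illustrative only: it assumes the usual form of the statistic, H = 12/[N(N+1)] Σ n_i [R̄_i − ½(N+1)]², which is not restated in this note, and it assumes untied observations. The function name `h_and_s` and the data are hypothetical.

```python
def h_and_s(samples):
    """Return (H, S) for a list of samples; assumes no tied observations."""
    pooled = sorted(x for sample in samples for x in sample)
    rank = {x: i + 1 for i, x in enumerate(pooled)}  # ranks 1..N, valid only without ties
    n_total = len(pooled)
    mid_rank = (n_total + 1) / 2
    h_sum = 0.0
    s = 0.0
    for sample in samples:
        n_i = len(sample)
        mean_rank = sum(rank[x] for x in sample) / n_i
        dev2 = (mean_rank - mid_rank) ** 2
        h_sum += n_i * dev2       # term of H before scaling (assumed form of H)
        s += n_i ** 2 * dev2      # term of Rijkoort's S
    h = 12.0 * h_sum / (n_total * (n_total + 1))
    return h, s


# Hypothetical data: C = 3 samples of equal size, N = 9, no ties.
samples = [[1.2, 3.4, 0.7], [2.2, 5.1, 4.8], [6.3, 0.1, 2.9]]
h, s = h_and_s(samples)
c = len(samples)
n = sum(len(sample) for sample in samples)

# With equal n_i the two statistics differ only by the constant N^2 (N+1) / (12 C).
assert abs(s - n ** 2 * (n + 1) * h / (12 * c)) < 1e-9
```

With unequal sample sizes the factor n_i differs from term to term, so no single constant converts H into S.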

Rijkoort tabulates the distribution of S for the following cases with
all $n_i$’s equal: C=3, N=6, 9, and 12; C=4, N=8; C=5, N=10. He
also gives the upper tails of the distributions of S (down at least to the
upper 5 per cent points) for C=3, N=15 and for C=4, N=12. In
addition, he gives approximate upper 5 per cent critical values of S for C
from 3 through 10, and for (equal) $n_i$’s from 2 through 10. True critical
values are given in some cases.

We have compared Rijkoort’s distributions with ours when C=3 
and have found a few discrepancies. Correspondence with Rijkoort 
about these leads to the following corrections to the cumulative prob- 
abilities in Rijkoort’s tables. (We omit corrections of only a single unit 
in the last decimal place.) 


k = 3, m = 2: S = 18, P = 0.467. P should be 0.400.
k = 3, m = 2: S = 14, P = 0.600. P should be 0.533.
k = 3, m = 5: 554 ≤ S ≤ 654. The tabulated P’s are all about 0.002
too low.

In addition, Rijkoort has kindly sent us the following corrections to his
table:

k = 4, m = 2: S = 74, P = 0.040. P should be 0.038.
k = 5, m = 2: S = 128, P = 0.0847. P should be 0.0910.
k = 5, m = 2: S = 122, P = 0.1208. P should be 0.1280.

Finally, in Rijkoort’s table of 5 per cent critical points the number
pair 558-566 in column 3 and row 5 III should be 566-578.
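
For small equal samples, corrected probabilities of this kind can be checked by direct enumeration. The following is a sketch under stated assumptions, not Rijkoort's own procedure: it assumes that the tabulated P is the upper-tail probability Pr{S ≥ s} when every assignment of the ranks 1, ..., N to the samples is equally likely, and it uses the fact that for equal sample sizes m the statistic reduces to S = Σ [R_i − m(N+1)/2]², where R_i is the rank sum of the i-th sample. The function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations


def s_distribution(c, m):
    """Exact null distribution of S for c samples, each of size m (no ties)."""
    n_total = c * m
    centre = Fraction(m * (n_total + 1), 2)  # expected rank sum of one sample
    counts = Counter()

    def recurse(remaining, groups_left, partial):
        if groups_left == 0:
            counts[partial] += 1
            return
        # Choose the ranks of the next sample from those not yet assigned.
        for group in combinations(remaining, m):
            rest = tuple(r for r in remaining if r not in group)
            recurse(rest, groups_left - 1, partial + (sum(group) - centre) ** 2)

    recurse(tuple(range(1, n_total + 1)), c, Fraction(0))
    total = sum(counts.values())
    return {s: Fraction(k, total) for s, k in counts.items()}


def upper_tail(c, m, s_value):
    """Pr{S >= s_value} under the null hypothesis."""
    dist = s_distribution(c, m)
    return sum(p for s, p in dist.items() if s >= s_value)


print(float(upper_tail(3, 2, 18)))  # 0.4
print(float(upper_tail(3, 2, 14)))  # 0.5333...
```

Under these assumptions the enumeration gives 2/5 = 0.400 and 8/15 ≈ 0.533 for k = 3, m = 2, and 4/105 ≈ 0.038 for k = 4, m = 2 at S = 74, in agreement with the corrected values listed above.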
3. On p. 587 of [a], in the fourth line after formula (1.5), the word 
“essentially” may be misleading. What we meant to indicate by this
heuristic statement is that without the factor (N—1)/N, H would be
like a sum of squared standardized deviations in which the finite-popu- 
lation corrections to the variances and the correlation between means 
had been disregarded. The factor (N—1)/N is the net result of giving 
due regard to these two points. 

4. In footnote 6, p. 591, it should have been stated that the compari- 
son between use and nonuse of the continuity correction in the two- 
sample test pertains to the one-tail version. For the two-tail version, 
use of the continuity correction is advantageous only when the proba- 
bility is 0.04 or more. 

5. To avoid an ellipsis, the following phrase should be added on 
p. 593, in line 14, just before the semicolon: “thereby altering the value 
of H.” 

6. The errors listed on the next page have been found in Table 6.1, 
most of them as a result of correspondence with T. J. Terpstra.³ These
corrections affect Figure 6.3 of [a] at a few points, but do not change 
the general patterns of deviations between true and approximate prob- 
abilities shown by Figure 6.3. 

7. We take this opportunity to call attention to a paper by Rijkoort 
and Wise [c] which has appeared since [a]. This presents new approxi- 
mations to the sampling distributions for Friedman’s test [a, Sec. 5.2] 
and for the H test [a] if all samples are of the same size (in which case 
the H test and Rijkoort’s test [44] are equivalent). The approximations 
are based upon a series expansion for the inverse of the incomplete 
Beta integral. Nomograms facilitating use of the new approximations 
are given (in both cases) for significance levels from 1 to 10 per cent, for 
3 to 20 samples, for sample sizes of 1 to 30. 

8. We should also like to call attention to a recent paper by van der 
Waerden [d]. In this paper the power of the Wilcoxon test in the normal 
case is discussed, and an alternative nonparametric test is proposed 
that is more powerful than the Wilcoxon test in the normal case. 


REFERENCES 


[a] Kruskal, William H., and Wallis, W. Allen, “Use of ranks in one-criterion
variance analysis,” Journal of the American Statistical Association, 47 (1952), 
583-621. 

[b] van der Reyden, D., “A simple statistical test,” Rhodesia Agricultural
Journal, 49 (1952), 96-104. 





³ We are indebted to Jack Nadler for making the computations involved in rechecking Table 6.1.
Almost all of the errors had occurred at one stage of the computations, and Mr. Nadler recomputed 
this stage completely. 
CORRECTIONS TO TABLE 6.1


In each pair of lines, the first repeats the line from Table 6.1 with the 
erroneous entries italicized, and the second gives the corrections. 








Sample sizes: n1, n2, n3

Approximate minus true probability:
    Γ (Linear Interp.)    B (Normal Interp.)













> 22. 22 2S 2s 


- 22 3? 





t+ 44 +4 


. 167 
- 100 


.020 
.024 


.012 
.013 


.O11 
.012 


.005 
-004 


.008 





267 
. 209 


.002 
-010 


002 
-004 
-002 


.010 
-009 


.013 
.012 


.003 
-002 


.006 
.004 


.002 










[44] Rijkoort, P. J., “A generalization of Wilcoxon’s test,” Proceedings, Kon-
inklijke Nederlandse Akademie van Wetenschappen, Series A, 55 (1952),
394-404. 

[c] Rijkoort, P. J., and Wise, M. E., “Simple approximations and nomograms
for two ranking tests,” Proceedings, Koninklijke Nederlandse Akademie van 
Wetenschappen, Series A, 56 (1953), 294-302. 

[d] van der Waerden, B. L., “Order tests for the two-sample problem and their
power,” Indagationes Mathematicae, 14 (1952), 453-58.


Rider, Paul R., THE DISTRIBUTION OF THE PRODUCT OF RANGES IN
SAMPLES FROM A RECTANGULAR POPULATION, Vol. 48, No. 263 (Sep-
tember 1953), 546-9.

On page 549, in formula (13) the factor 2 should be removed from the 
denominator. 


Robson, D. S., and King, A. J., MULTIPLE SAMPLING OF ATTRIBUTES,
Vol. 47, No. 258 (June 1952), 203-15.

The estimate of variance, equation (6), pages 205 and 215, should 
read 


MN m—1 Mn u m —1 
n—m me 


nm ig Miz — 1 





~ 1rM—wN ..om.. N- {-0M,. 
re = —| ste a OD 3 





A proof of the unbiasedness of this estimator may be deduced from the 
appendix by noting, in addition, that 


Dwyer, Paul S., and Waugh, Frederick V., ON ERRORS IN MATRIX
INVERSION, Vol. 48, No. 262 (June 1953), 289-319.

Dr. W. Duane Evans and Mr. John C. H. Fei have called our attention
to the need for modifying Section VII of our paper. In that Section
of the paper we considered the inversion of a given Leontief matrix,
$L = I - A$, where each element of A is non-negative. We assumed
that any element of L might be in error by 100 k per cent, and we pro- 
posed a very simple upper bound to the discrepancies between elements 
of the given matrix and the true matrix. Unfortunately equation (7.5) 
does not provide an upper bound to such discrepancies. 
As we stated in Section VI, maximum discrepancies in all elements
of the inverse of the Leontief matrix would occur if each error in the
given matrix were negative with the absolute value of its bound.
Maximum discrepancies do not occur with $(1-k)(I-A)$, since while the
diagonal errors, $-kI$, would be negative, yet the non-diagonal errors,
$kA$, would be positive; but with

$$ (1-k)I - (1+k)A = L - k(I + A) = L - kH. $$

The extreme inverse matrix may be obtained by calculating $(L - kH)^{-1}$.
Alternatively the extreme error of $L^{-1}$ can be computed from equation
(3.4) of our paper. This becomes

$$ D = k\,(L^{-1}HL^{-1})\left[ I + (kHL^{-1}) + (kHL^{-1})^{2} + \cdots \right] $$

with $HL^{-1} = 2L^{-1} - I$.

If the diagonal terms are not subject to error the extreme inverse is
obtained from $I - (1+k)A$. The discrepancy formula above holds with
H replaced by A and $AL^{-1} = L^{-1} - I$.
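
The agreement between the series for D and the direct calculation of the extreme inverse can be illustrated numerically. The sketch below is only a check on a small hypothetical case: the 2 × 2 matrix A, the value of k, the helper functions and the number of series terms are all assumptions made for illustration, and the closed-form 2 × 2 inverse is used purely to keep the example self-contained.

```python
def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][t] * b[t][j] for t in range(n)) for j in range(n)] for i in range(n)]


def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def mat_scale(c, a):
    return [[c * x for x in row] for row in a]


def inverse_2x2(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]


def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


A = [[0.2, 0.3], [0.4, 0.1]]   # hypothetical non-negative coefficients
k = 0.01                       # each element may be in error by 100k = 1 per cent
I2 = [[1.0, 0.0], [0.0, 1.0]]

L = mat_add(I2, mat_scale(-1.0, A))   # L = I - A
H = mat_add(I2, A)                    # H = I + A
L_inv = inverse_2x2(L)

# Direct route: extreme inverse (L - kH)^(-1) minus the given inverse.
extreme_inv = inverse_2x2(mat_add(L, mat_scale(-k, H)))
D_direct = mat_add(extreme_inv, mat_scale(-1.0, L_inv))

# Series route: D = k (L^-1 H L^-1) [I + (k H L^-1) + (k H L^-1)^2 + ...].
HL_inv = mat_mul(H, L_inv)
term, series = I2, I2
for _ in range(60):                   # truncation point chosen arbitrarily
    term = mat_mul(term, mat_scale(k, HL_inv))
    series = mat_add(series, term)
D_series = mat_scale(k, mat_mul(mat_mul(L_inv, HL_inv), series))

assert max_abs_diff(D_direct, D_series) < 1e-12
# Identity quoted in the text: H L^-1 = 2 L^-1 - I.
assert max_abs_diff(HL_inv, mat_add(mat_scale(2.0, L_inv), mat_scale(-1.0, I2))) < 1e-12
```

The series converges here because k times the largest characteristic root of HL⁻¹ is well below one; for larger k or a nearly singular L the direct calculation of (L − kH)⁻¹ is the safer route.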


Brown, J. A. C., Houthakker, H. S., and Prais, S. J., ELECTRONIC
COMPUTATION IN ECONOMIC STATISTICS, Vol. 48, No. 263 (September 1953),
412-28.

We are indebted to George W. Thomson of the Ethyl Corporation, 
Detroit, for drawing our attention to some errors in the illustrative ex- 
ample quoted in our article. The errors are associated with the con- 
vergence of the iterative process given on p. 416. The value of 3(3 — 5) 
there given is the value for which convergence is effected after two 
iterations, but this does not mark the boundary for one-sided converg- 
ence. The latter value is }. A similar correction should be made on p. 
422. 

There is a further mistake on p. 416 at step (4) where the words “pos- 
itive” and “negative” have to be interchanged; and a corresponding 
change in the interpretation of the function letter En. 

We apologize to readers who may have been confused by these er- 
rors. 
BOOK REVIEWS 


Statistics in Psychology and Education. Fourth Edition. H. E. Garrett. New 
York: Longmans, Green and Company, 1953. Pp. xii, 460, $5.00. 


Frederic M. Lord, Educational Testing Service 


THE fourth edition of this widely used text represents a considerable re- 
writing and revision of the previous one. Several recent references are 
listed in footnotes to the text. Material on analysis of variance has been ex- 
panded into a separate chapter, which includes an illustrative example of 
analysis of covariance. Other new materials include a fuller treatment of 
Fisher’s z; methods of drawing a random sample; stratified sampling; the 
fourfold point correlation; factors determining selection of tests in a battery; 
and one-tailed tests of significance. 

The first six chapters will serve as a good text for students with minimal 
mathematical aptitude who are to learn to compute such statistics as means, 
standard deviations, percentiles, normal curve areas, and Pearson correla- 
tion coefficients. For students who wish to go beyond this, a text that is 
more nearly correct and complete in its statements of logical and statistical 
inferences would be preferable, providing it is not beyond the student’s 
intellectual grasp. 

There is little occasion to take serious exception to the material in the 
first six or seven chapters. In the important section on Standards of Accuracy 
in Computation, however, the reader may reach the erroneous conclusion 
(p. 23) that a square root usually has fewer significant figures than (often one- 
half) the number of significant figures in the number whose square root is 
extracted. (The illustration in the text should be corrected to show that 
√159.5600 = 12.631706 (sic) with an error of no more than .0000022.) 
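
The point about square roots can be checked numerically. The sketch below assumes, for illustration only, that the radicand 159.5600 is correct to within half a unit in its last decimal place (0.00005); that assumption is not stated in the text under review. It shows that taking a square root roughly halves the relative error, so the root keeps about as many significant figures as the radicand.

```python
import math

value = 159.5600
half_unit = 0.00005   # assumed: radicand correct to four decimal places

root = math.sqrt(value)
# Change in the root when the radicand is off by its assumed maximum error.
root_error = math.sqrt(value + half_unit) - root

relative_error_value = half_unit / value
relative_error_root = root_error / root

print(f"square root           : {root:.6f}")
print(f"propagated error      : {root_error:.7f}")
print(f"relative error ratio  : {relative_error_root / relative_error_value:.4f}")
```

Under the stated assumption the propagated error comes out near two units in the sixth decimal place, which is in line with the bound the reviewer quotes.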

A very worthwhile achievement of the fourth revision is the removal 
(primarily from chapters 8-10, dealing with sampling, standard errors, and 
testing experimental hypotheses) of the serious confusion, pervading the 
third edition, between a priori and fiducial probability. Only the last sentence 
of the final chapter escaped revision: “This correction gives the value which 
R would most probably take in the population from which our sample was 
drawn.” 

Some of the more serious of the remaining errors and misstatements, 
mostly relating to the logic of statistical inference, are listed below: As a 
criterion for general use in judging randomness of sampling it is suggested 
that “If samples are fairly consistent, therefore, they are presumably random 
unless subsequent examination reveals a common bias.” (p. 205). Also, if 
we can assume the trait to be normally distributed, then “symmetry of dis- 
tribution becomes an excellent criterion of sample adequacy.” (p. 204). 
After making a certain test of the significance of the difference between 
means, “... we retain the null hypothesis and conclude with confidence 
that, on the evidence, there is no real difference between Norwegians and 
Belgians on the ‘combined scale’... ” (p. 216). A two-tailed test “should 
always be used when, in accordance with the null hypothesis, our two groups 
have conceivably been drawn from the same population...” (p. 217). 
“Forty-two salesmen have been classified into three groups—very good, 
satisfactory, and poor—by a concensus of sales managers... how many 
of the 42 salesmen may be expected to fall in each category on the hypothesis 
of a normal distribution [may be determined from a table of normal curve 
areas] by dividing the baseline of a normal curve (taken to extend over 6σ) 
into 3 equal segments of 2σ each.” (p. 257). 

A misstatement about statistical technique is that “χ² is not stable when 
computed from a table in which any experimental frequency is less than 
5” (p. 258). (It is the theoretical frequencies that are pertinent.) 
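
The distinction between experimental and theoretical frequencies can be made concrete with a small sketch. The fourfold table below is hypothetical: it was chosen so that one observed count falls below 5 while every theoretical (expected) count, computed from the marginal totals, is 6 or more.

```python
# Hypothetical 2x2 table of observed counts (not from the book under review).
observed = [[2, 18],
            [13, 17]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Theoretical frequency for each cell: (row total x column total) / grand total.
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi_square = sum((o - e) ** 2 / e
                 for obs_row, exp_row in zip(observed, expected)
                 for o, e in zip(obs_row, exp_row))

smallest_observed = min(min(row) for row in observed)
smallest_expected = min(min(row) for row in expected)

print("expected frequencies:", expected)
print("smallest observed   :", smallest_observed)
print("smallest expected   :", smallest_expected)
print("chi-square          :", round(chi_square, 4))
```

By the usual rule of thumb this table would pass, since it is the smallest expected frequency, not the smallest observed one, that is compared with 5.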

The chapter on The Reliability and Validity of Test Scores does not pro- 
vide as clear a discussion of the different kinds of reliability as could be 
desired. Exactly two pages in this chapter, incidentally, are devoted to a 
consideration of item analysis. 

Special favorable mention should be made of the material on Type I and 
Type II errors, and of much of the material on the phi coefficient, on one- 
tailed significance tests, on scaling, and on the multiple correlation coeffi- 
cient. 

In the introduction to the first edition in 1926, Woodworth indicates that 
the statistician for whom the book is intended is “he who has selected the 
scientific or practical problem. ... He selects the statistical tools to be 
employed ... [he] must have a discriminating knowledge of the kit of tools 
which the mathematician has handed him.” In the reviewer’s opinion, what- 
ever may have been the case at the time the foregoing was written, the book 
is not appropriate for today’s statistician who answers to, or today’s student 
who is to be trained to answer to, the foregoing description. The book makes 
no serious effort to specify the assumptions underlying many of the state- 
ments made. In the discussion of regression and prediction, for example, it is 
frequently asserted that the predicted value is the “most likely value” with- 
out any suggestion that normality or some other special property of the 
bivariate distribution is being assumed. Standard error formulas are given 
and their use illustrated often without indicating to the reader that 1) 
normality has been assumed, and 2) the formulas can only be safely used 
with large samples. 

The chapter on Further Methods of Correlation particularly shows the 
tendency to provide a ready recipe for the calculation of any desired statistic, 
without adequately explaining the meaning of the statistic in question. For 
example, it would be useful to point out that the point biserial correlation is 
simply the Pearson product-moment correlation that would be found if 
any two arbitrary numerical values (e.g., 0 and 1, or 7 and 19) were assigned 
to represent the dichotomous variable, and the usual formula for the product- 
moment correlation were then applied. The reader who wishes to apply in 
actual work the techniques of analysis of variance and covariance, the paired 
comparison scaling technique, or the biserial, tetrachoric, or contingency 
correlation methods should refer to some book giving a more thorough treat- 
ment than is possible in a text designed primarily for other purposes. 
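
The remark about the point biserial correlation is easy to verify. The following sketch uses made-up scores and a made-up pass/fail grouping; it computes the coefficient from the usual textbook formula and then again as an ordinary product-moment correlation with the dichotomy coded first as 0 and 1 and then as 7 and 19. The function names are hypothetical helpers for this illustration.

```python
import math

def pearson(x, y):
    """Ordinary product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def point_biserial(groups, scores):
    """Textbook formula: (M1 - M0) / s * sqrt(p * q), with s using divisor n."""
    n = len(scores)
    upper = [s for g, s in zip(groups, scores) if g]
    lower = [s for g, s in zip(groups, scores) if not g]
    p = len(upper) / n
    q = 1.0 - p
    mean_all = sum(scores) / n
    sd = math.sqrt(sum((s - mean_all) ** 2 for s in scores) / n)
    mean_gap = sum(upper) / len(upper) - sum(lower) / len(lower)
    return mean_gap / sd * math.sqrt(p * q)

# Made-up data: a dichotomous item (passed or failed) and a total score.
passed = [True, False, True, True, False, False, True, False]
scores = [14, 9, 17, 12, 8, 11, 15, 10]

r_formula = point_biserial(passed, scores)
r_coded_0_1 = pearson([1 if g else 0 for g in passed], scores)
r_coded_7_19 = pearson([19 if g else 7 for g in passed], scores)

print(r_formula, r_coded_0_1, r_coded_7_19)
```

The three values agree to rounding error. The only effect of the choice of codes is on the sign, which reverses if the larger number is assigned to the other category.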


Sources of Wage Information: Employer Associations. N. Arnold Tolles and 
Robert L. Raimon (New York State School of Industrial Relations at Cornell 
University, Ithaca, New York). “Cornell Studies in Industrial and Labor 
Relations,” Volume III, Spring 1952, pp. xvii, 351. Paper. $3.00. 


M. I. Gershenson, California Department of Industrial Relations 


PART I of this monograph presents individual digests of most of the wage 
surveys conducted by employer associations in the United States. The 
summary descriptions of each of the 220 wage surveys conducted by 120 
employer associations include such information as starting date of the 
survey, how frequently the survey is made, what industries and areas are 
surveyed, the number of plants or companies participating, the sample 
coverage, types of data collected, and types of statistical measures used to 
summarize the information; also, types of “fringe” items, method used to 
collect the original information, and form of publication or distribution of 
the data. 

Unfortunately any ambitious listing such as this, which by the nature of 
the project requires a great deal of time, becomes out of date even before 
it goes to press. No wage figures are given and the authors stress the fact 
that the listing of an association does not necessarily imply that the reader 
may obtain any wage figures from that association. As a reference of avail- 
able source data on wages, the listings are certainly valuable but the mono- 
graph may be frustrating to those seeking specific wage rate information 
since a large number of the entries indicate that the survey results are avail- 
able only to members of the association. 

An alphabetical list of employer associations, a regional finding list, an 
industrial finding list, and a finding list of area-oriented surveys are included 
in Part I together with a technical note on definitions, procedures followed, 
and problems encountered. 

The most valuable contribution to the field is contained in Part II which 
presents a detailed analysis and appraisal of wage surveys conducted by 
employer associations. The authors assess very frankly the elements of 
strengths and weaknesses of existing survey methods and state that this may 
be helpful to employer associations contemplating the conduct of wage 
surveys or seeking to improve their present methods and also to employers 
who seek to interpret and evaluate wage information they receive, to labor 
unions seeking to appraise the validity of the wage information obtained 
through employer association surveys, and to government analysts who may 
need standards for assessing wage information presented to them. Both 
producers and users of wage surveys will find a careful reading of Part II 
of this monograph very worth while. It contains by far the best practical 
discussion to date on the methodology of wage surveys. 

Among a number of weaknesses discussed are those of sampling and of 
methods of collection. It is the authors’ conclusion that the employer associa- 
tions appear to give very little attention to the selection of their wage survey 
samples. One may agree that “the objective should be that of securing a 
representative and balanced sample”, but it should be pointed out that many 
associations have no way of obtaining adequate universe data in terms of 
individual establishments for a given area or industry. 

The authors believe the accuracy of many of the surveys and the uniform- 
ity of their occupational classification are open to question. They point out 
that nine-tenths of the wage surveys of employer associations are based on 
mail questionnaires and that more than half of those which seek occupational 
data solicit the information in terms of mere job titles without any standard- 
ized job descriptions. Attention is directed to the very wide range in wage 
rates for individual occupations which results from such procedures. 

It is implied that in many cases greater accuracy and narrower ranges are 
obtained in surveys where the data are collected directly by field visits using 
carefully prepared occupational descriptions, the procedure used by the U. S. 
Bureau of Labor Statistics. Unfortunately this in itself does not insure nar- 
row ranges, as is demonstrated by the results of the BLS Occupational Wage 
Surveys. 

Wage surveys can certainly be improved by better methods of sampling 
and collection, but it is this reviewer’s opinion that a great deal remains to 
be done in developing means of eliciting more accurate replies from re- 
spondents. Highly developed job descriptions in the hands of trained field 
agents help, but evidence indicates we must still strive to devise more effec- 
tive means of reducing response error even where we have field collection 
and job descriptions. 

The authors touch on this problem and discuss some possible solutions, 
but there is need for much additional work in this aspect of wage surveys. 


Revue de Statistique Appliquée. Volume 1, No. 1, 1953. Paris: Centre de For- 
mation des Ingénieurs et Cadres aux Applications Industrielles de la Statistique, 
Institut de Statistique de l’Université de Paris. Pp. 103. Paper. 


THIS new journal is the organ of the newly formed (1952) statistical center 
whose full title is given above. The objectives of the Centre are stated as 
follows by its general director, M. Georges Darmois: 


We want to make it possible for the leaders of French industry to train 
their personnel in the effective use of the statistical techniques which have 
so completely proved themselves in other countries. 

It is necessary on the other hand to pursue research so that new problems 
may be studied and so that a continuing relationship with the users of 
statistics may be maintained. (Page 8.) 
The Centre offers two types of short courses in statistics for industrial 
personnel. The first, lasting from 10 to 15 days, requires no special mathe- 
matical training. The second lasts for three weeks and is designed for engi- 
neers. The first course emphasizes the methods of statistical quality control 
while the second provides a wider coverage of statistical topics. Both are 
oriented toward statistical inference. A detailed outline of contents is given 
in this first issue of the Revue, pp. 16-24. 

The Centre will also maintain a consulting service (Bureau d’Etudes du 
Centre) which will serve the firms and individual engineers who belong to 
the Centre and contribute to its financial support. The Revue, under the 
direction of M. E. Morice, has been established primarily as a liaison with 
the membership. It is to carry examples of statistical applications as well as 
news of statistical meetings and the like. 

For the first few years at least, the Revue will concentrate on discussions 
of the usefulness of statistics in different areas of business, backed up by 
numerous concrete examples. The first issue fits closely in this mold with a 
series of articles describing both general and specific applications in many 
parts of French industry. While there are articles on statistical quality con- 
trol, the applications cover a much wider field of business applications. For 
example there is a discussion of the organization of statistics within the firm, 
the application of statistics in the planning of capacity of equipment needed 
in power plants, description of an industrial experiment, and two articles on 
market research, one of which gives some very interesting data on taste- 
testing. There is an annotated bibliography describing four different books 
on statistics, all in French, which might be of interest to members of the 
Centre (pp. 97-99). 

The Revue hopes gradually to shift its emphasis from concrete applications 
to methodological articles which will be aimed at graduates of the short 
courses described above. It hopes also to publish applications of statistics 
by these graduates and by other readers, and makes an interesting appeal 
for submission of expériences malheureuses whenever a lesson can be learned 
therefrom. The Journal’s address is: 

Monsieur le Rédacteur en Chef de la «Revue de Statistique Appliquée» 
11, rue Pierre-Curie, Paris 5ème, France. H.V.R. 


The Problem of Summation in Economic Science. A Methodological Study 
with Applications to Interest, Money and Cycles. Göran Nyblén. Lund Social 
Science Studies No. 4, Lund, C. W. K. Gleerup, 1951. Pp. xii, 289. 


John S. Chipman, Harvard University 


FROM the title of this book, one might gain the impression that it is a 
study on index numbers. It is nothing of the sort. Broadly, it is no less 
than a critique of the foundations of modern economic theory, especially the 
theory of distribution; specifically, it is the only attempt this reviewer has 
seen to confront the theory of games with empirical data. 

Nyblén opens in Chapter I with a discussion of “the fundamental idea that 
economic variables are produced by a mechanism, which can be described 
as a system of simultaneous equations” (pp. 5-6) and quotes as typical of a 
predominating contemporary viewpoint Samuelson’s assertion that “any 
sector of economic theory which cannot be cast into the mold of such a sys- 
tem must be regarded with suspicion as suffering from haziness” (p. 6). 

In the rest of the book Nyblén subjects this point of view to forceful 
criticism. He turns in Chapter II to a discussion of specific economic models, 
dealing first with the Leontief system and criticizing it for being “very 
‘mechanical’ in the sense that no fundamental economic decisions are explic- 
itly tied to it...” (p. 21). It should properly be regarded, he says, as em- 
bedded in a linear programming system, which allows for choice among 
alternative production processes; but such a system can only be made deter- 
minate by a statement of social objectives (i.e., by maximization of an “ob- 
jective function”) and this implies the control of the economy by a single 
will, and therefore “can comprise no real treatment of a problem of distribu- 
tion” (p. 31). The conclusion is somewhat weakened by the subsequent 
publication of the “substitution theorem” of linear programming, but not 
invalidated.¹ 

Next, Nyblén discusses the marginal productivity theory of distribution, 
pointing out that marginal productivity does not determine the distributive 
shares, but only determines schedules of demand for factors and supply of 
products; distribution is then determined in Walrasian fashion by the 
interrelation of the demand and supply schedules of firms and households. 
If markets are competitive, the solution is determinate, and “the distribution 
process described is automatic, because no particular agreements of any kind 
between the decision units are needed for it to function, and it is ultra- 
harmonic, because no conflicts of interest are present and no unit has more 
influence on the price-determination than any other” (p. 41). On the other 
hand if monopolistic or imperfect markets are introduced, the system be- 
comes overdetermined; for, as Marschak has pointed out, the addition of 
sloping demand functions adds more equations to the system than unknowns, 
and these functions can therefore not be independent of one another if 
markets are to be cleared. Distribution is then left unexplained. 

From this impasse Nyblén is led to a consideration of the theory of games. 
He notes the distinction made by von Neumann and Morgenstern between 
“inessential games,” in which the payoff that goes to a set of players is al- 
ways equal to the sum of the amounts those players would receive when 
acting independently, and “essential games” in which the amount received 
by a coalition always exceeds the amount that its members could obtain 
independently. This is where “summation” comes in: the proposition that 





1 T. C. Koopmans (ed.), Activity Analysis of Production and Allocation, New York, 1951, chapters 
VII-X. 
“in general” a set of players can obtain more by coalescing than by acting 
independently is called the “first summation theorem.” In Nyblén’s words: 
“there is always one part of the national income the distribution of which 
can and must necessarily be settled through agreements between the members 
of society; the distribution of the national income can never be completely 
settled in an automatic and harmonious way.” (p. 77). The “generality” 
here, it should be noted, is purely formal, and Nyblén is to be criticized for 
not making sufficient distinction between the formal and the empirical. The 
fact that firms’ revenue functions are necessarily interdependent was partly 
recognized by Chamberlin, and it is curious that Nyblén includes no discus- 
sion of the former’s solution to the problem in Chapter V of the Theory of 
Monopolistic Competition. However it must be admitted that there is still 
considerable oligopolistic indeterminacy left in the general equilibrium of 
monopolistic competition, so that there is ample justification for Nyblén’s 
view that the system can be neither “automatic” nor “ultra-harmonic”. 
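
The distinction between inessential and essential games can be illustrated with a characteristic function, which assigns to each coalition the amount it can secure for itself. In the sketch below, the additive game and its weights are made up for illustration, and the second example is the familiar three-person zero-sum majority game in reduced form; neither is taken from Nyblén's book, and the function names are hypothetical.

```python
from itertools import combinations

def coalitions(players):
    """Yield every non-empty coalition of the given players."""
    for size in range(1, len(players) + 1):
        for members in combinations(players, size):
            yield frozenset(members)

def coalition_gain(v, coalition):
    """Amount a coalition secures beyond what its members get acting alone."""
    return v[coalition] - sum(v[frozenset([p])] for p in coalition)

def is_inessential(v, players):
    """True when no coalition gains anything by forming."""
    return all(abs(coalition_gain(v, s)) < 1e-12 for s in coalitions(players))

players = ("A", "B", "C")

# Inessential example (assumed weights): each coalition gets the sum of its parts.
weights = {"A": 2.0, "B": 3.0, "C": 5.0}
additive_game = {s: sum(weights[p] for p in s) for s in coalitions(players)}

# Essential example: a lone player gets -1, any pair gets +1, all three get 0.
value_by_size = {1: -1.0, 2: 1.0, 3: 0.0}
majority_game = {s: value_by_size[len(s)] for s in coalitions(players)}

print("additive game inessential:", is_inessential(additive_game, players))
print("majority game inessential:", is_inessential(majority_game, players))
for s in coalitions(players):
    print(sorted(s), "gain from coalescing:", coalition_gain(majority_game, s))
```

In the majority game every coalition of two or three players secures more than its members could obtain separately, which is the situation the “first summation theorem” describes. How that surplus is divided is exactly what the characteristic function leaves open.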

Next, Nyblén takes issue with the assumption of transferable utility, that 
is, with the postulate of von Neumann and Morgenstern that the utility lost 
by one player or set of players is equal to the utility gained by the remaining 
set. As von Neumann and Morgenstern were forced to admit, this boils down 
to the assumption that payoffs are in monetary terms and that players maxi- 
mize the expected value of monetary returns rather than utility. Nyblén 
takes issue with transferable utility on the basis of Arrow’s proposition that 
“in general” no social welfare function can be constructed from individual 
utility functions, so no common standard of value exists; this is the “second 
summation theorem”. He goes on to state: “If such a common preference 
scale exists there can be essentially no diversity of interests at all, and the 
distribution process can constitute no problem” (p. 95). This statement is 
rather extreme, for even if individual orderings of commodities are identical, 
so that a social welfare function can be established, utility is still not trans- 
ferable, that is, interpersonal comparisons still cannot be made. Furthermore, 
there is still room for struggle over distributive shares. Thus the second sum- 
mation theorem is rather a will-o’-the-wisp. 

Once we settle for monetary payoffs, the transferability assumption still 
leaves a serious problem: the constant-sum character of the game. In order 
to deal with non-zero- (or non-constant-) sum games von Neumann and Mor- 
genstern introduced, as is well-known, a fictitious (n+1)-th player who re- 
ceives (or pays) the difference; however, regarding as a “patent absurdity”² 
the notion that this player can make bribes, they limited the solutions of non- 
zero-sum games to discriminatory ones in which “nature” is allowed by the 
real players to receive only a specified amount—in the extreme case, only 
what it could obtain in isolation. As a result there is little to distinguish 
this from the constant-sum game. This strikes me as a principal weakness 
of the theory, and it is reflected in Nyblén’s treatment (p. 91) in which the 








2 John von Neumann and Oscar Morgenstern, The Theory of Games and Economic Behavior, Prince- 
ton, 1947, p. 513. 
n real players cooperate in order to maximize their payoff from nature 
(this is the “production problem”) and then fight over the spoils (the “dis- 
tribution problem”). This interpretation neglects what is surely the distinc- 
tive feature of the economic “game”: the way in which the pie is distributed 
affects the size of the pie itself. Nyblén is conscious of the artificial nature 
of this dichotomy between production and distribution, and finally confesses 
that he is “not able to point out a synthesis between the two extremes” 
(p. 128). As a second-best solution, he concludes that the main features 
of the free economy are best analyzed in terms of distribution theory rather 
than by production theory—by the theory of games instead of by models 
which can be expressed in terms of systems of equations. 

As a first step in applying the theory of games, Nyblén tackles the aggrega- 
tion problem (pp. 57-64). Formally, there are great difficulties in game theory 
in aggregating players into indissoluble groups. Some readers may be un- 
satisfied (though intrigued) by his procedure of invoking “incomplete in- 
formation” and “ ‘irrational’ socio-psychological factors” in order to justify 
aggregation of players into groups (forming constant-sum subgames) which 
themselves obey the “rational” dictates of game theory; perhaps it would 
be better to treat these groups as “teams” in Marschak’s sense.⁴ 

We come then to the empirical part of the book. In Chapter V the author 
divides the economy into four groups: workers, farmers, entrepreneurs, and 
capitalists or rentiers (he gives them the misleading name “savers”) and at- 
tempts to explain the share of the latter in the national income. He sets 
himself the task of explaining the close correlation between the price level 
and the rate of interest before 1932, and the gradual rise in the price level 
relative to the rate of interest after that date. Following a most interesting 
discussion of the Jevons-Wicksell theory of interest and its extension by 
von Neumann and Hawkins, Nyblén discusses the abstinence and Keynesian 
theories, concluding with the view (pp. 150-1) that both are rationalizations 
of economic events, the first prior to 1930 and the second subsequently. 
Taking as a measure of rentiers’ relative income share the ratio of the inter- 
est rate to the general price level (p. 169), Nyblén concludes from the data 
that rentiers’ share in national income was relatively constant before 1930 
and began to decline thereafter. As measures of “the” interest rate he chooses 
railroad bond yields and rediscount rate for the U.S., and yield of consols 
and Bank Rate for Britain. The choices are rather unfortunate owing to the 
fact that the ratio of interest rate to price level is a fairly accurate index of 
relative income of bondholders only in the case of short-term privately- 
held securities. Faced with declining yields on consols, widows and orphans 
can postpone their euthanasia indefinitely by holding on to them, and in the 
case of railway bonds can postpone the day of reckoning for a generation. 





3 The theory of non-constant-sum games is still in the early process of development. Cf. John F. 
Nash, Jr., “The Bargaining Problem,” Econometrica, April 1950, and Howard Raiffa, “Arbitration 
Schemes for Generalized Two-person Games,” Contributions to the Theory of Games, Vol. II, edited by 
H. W. Kuhn and A. W. Tucker, Princeton, 1953, pp. 361-87. 

4 Econometrica, July 1953, pp. 485-6. 
Similarly, a rise in yields is of no temporary use to bondholders during a 
period of inflation (with consols, no use at all) unless their holdings are short- 
term. It is only in the long run that parallel fluctuations of interest rates and 
prices can indicate stability of interest-income. Perceiving however that 
there was, as we may grant, a marked change after 1930, Nyblén comes forth 
with the hypothesis (p. 166) that before 1930 independent central banks 
carried out policies designed to maintain stability in income shares, whereas 
after that date they lost their independence and came under the political 
control of laborers, farmers, and entrepreneurs. In the language of game 
theory, bondholders after 1930 became the “excluded player” in a now dis- 
criminatory four-person game. Thus, concludes Nyblén, “the theory of games 
gives a theoretical structure capable of comprising such sharp changes, which 
is remarkably different from the potentialities of traditional economics” 
(p. 165). 

While Nyblén’s emphasis on the political determination of distributive 
shares is interesting, his claims for the theory of games are inadequate. 
There is nothing within the theory of games to explain the change from an 
objective to a discriminatory solution; this follows necessarily from the 
static nature of game theory. The change remains exogenous and unex- 
plained. The theory of games takes the “accepted standard of behavior” 
as given, while it is this that is mostly in need of explanation. Like a ward- 
robe which provides suits for all occasions, the theory of games can no doubt 
provide categories of solutions to fit all the possible facts; but no amount 
of study of that wardrobe will predict or even explain what its user will 
wear tomorrow. And even then the clothes are completely out of character 
with the wearer, and one cannot help feeling that they fit very uncomfortably. 

In Chapter VI Nyblén turns to the problem of international distribution 
of income. In the course of lengthy excursions into the quantity theory of 
money, the Patinkin controversy, and the purchasing power parity theory, 
he makes the following observations: that according to the quantity theory 
inflation leaves relative prices, and consequently the distribution of incomes, 
unchanged (Nyblén fails to stress the fact that this analysis is applicable 
only to a stationary economy) and that, if purchasing power parity is as- 
sumed as well, inflation leaves international distribution of income (measured 
in some currency) unchanged. It is curious that Nyblén does not discuss the 
inadequacy of such a measure of a country’s real income, even if the latter 
can be said to exist. He then asserts that the purchasing power parity and 
quantity theories were valid in some periods but not in others, and seeks a 
“theory of theories”. The latter turns out to be the theory of games with de- 
composable characteristic functions. These are constant-sum games in which 
the players are divided, say, into two sets (the sets will be countries) with the 
property that the amount that any group from one set can obtain together 
with any group of the other set is the same as the amount the two groups can 
obtain in isolation; in other words, there is no advantage to be gained from 
inter-country coalitions. In spite of this property the solutions of this game 
are not necessarily decomposable, that is, it is not true “in general” that the 
sums going to each country are constant. Thus, if the players in one country 
fail to coalesce, they may fail to get their “due”, and a transfer—called an 
“Excess”—will then take place to the other country—a tribute without 
being a bribe. The case in which the tribute is zero is said to be “exceptional” 
(pp. 214-15), but later we are told (p. 220) that its opposite seems “excep- 
tional”; again the distinction between the formal and the empirical is blurred, 
and no attempt is made to give this tribute any interpretive meaning. As a 
result, a factitious hypothesis emerges: that “the observations consistent 
with the purchasing power parity theory imply the prevalence of a zero 
Excess between the countries studied” (p. 223) and “the observations show- 
ing successive changes of the relations between the national incomes com- 
pared imply the presence of a non-zero Excess” (p. 224). If this were all 
there was to the hypothesis, the same objections would hold as were pre- 
viously presented; but in this case there is an additional (but hardly startling) 
observation (pp. 224-7): A non-zero Excess comes about only if some region 
is not sufficiently integrated into a coalition; furthermore, once such an Excess 
has developed, one may expect a distribution struggle to ensue, taking the 
form of competitive inflationary movements. Nyblén makes special note of 
(1) the pre-1914 period, in which the purchasing power and quantity theories 
are said to be valid (apparently in the trivial sense that both exchange rates 
and relative price levels were steady), (2) the period of the early twenties 
with violent fluctuations in terms of trade, and (3) the period after the second 
World War and before Korea, characterized by a worsening of Europe’s 
terms of trade. The latter events he blames on Europe’s lack of integration, 
and the policy recommendations follow naturally. 

Nyblén finally turns, in Chapter VII, to an analysis of business cycles. 
He criticizes econometric models depicting cycles as oscillations around an 
equilibrium derived from difference or differential equations, and after dis- 
cussing the works of Schumpeter, Wiener, Domar, and Dahmén, comes 
forth with his own novel theory of cycles. To begin with, “economic progress 
and economic crises and depressions are most intimately connected” (p. 
266) for the following reasons: an innovation, taking the form of a specific 
investment, raises capital values and lowers capital values in other spheres 
of the economy, thus changing the “objective possibilities” of the situation 
(specifically, the characteristic function of the game); there ensues “a dis- 
tribution struggle which we believe to be the essence of crises and depres- 
sions” (p. 266) since it results in bankruptcies and a “breakdown of the pric- 
ing system” (p. 262). The depression is intensified by a distribution struggle 
among major social groups (in addition to conflict among businesses) brought 
about by political upheavals (p. 269n) and only brought to an end when the 
distribution struggle has been settled. Then the revival takes place, since the 
previous innovations had opened up profitable opportunities that had been 
neglected during the distribution struggle. 

Nyblén’s business cycle theory impresses me as being the least artificial 
of his hypotheses, and it is noteworthy that it is also the most original and 
least tied down to the specifications of a particular game-theoretical construc-
tion.

An interesting question which emerges from this discussion is that of deter- 
minacy. Nyblén points out that the distribution struggle is settled through 
agreements and therefore not automatic and harmonious; but if its outcome 
can be predicted at all, is it not at least automatic? The question is not one 
of prediction with certainty versus prediction with a given probability, for 
Nyblén rejected stochastic systems-of-equations systems along with the
rest (p. 5). A partial exit to this impasse might be found in the dynamic 
character of the model, for even if a distribution struggle is settled, a lot of 
time elapses during which the struggle goes on. He admits that the outcome 
“could never be uniquely predictable” (p. 265), yet appears to be committed 
in principle to a belief in the ultimate predictability (at least in a stochastic 
sense) of socio-economic events. The answer seems to be that a theory is 
non-automatic and non-harmonic only if the phenomena it describes are not 
determinate within the economic system, but only within a wider universe; 
and in this wider universe, it appears that hypotheses cannot be expressed 
in terms of systems of equations, but must find some other, qualitative, 
expression. It has been suggested to me (by D. Ellsberg) that the kind of 
prediction that Nyblén and other game theorists may have in mind consists 
of a narrowing down of the class of possible solutions; thus one might be 
able to predict the range of a variable without any specification of a proba- 
bility distribution over that interval. 

Nyblén has done economists a service by attempting to apply the theory 
of games to the facts, but the results cannot be considered conclusive, as the 
theory is imperfect and the statistical methods are crude. Moreover the 
analysis is frequently marred by misplaced concreteness. More important, 
however, is his insistence on the role of political and social phenomena in the 
explanation of economic events. His study remains an exploration into an 
as yet little-known world. It is to be hoped that his work will stimulate others 
into seeking answers to some of the fundamental questions he raises.


Measurement of Productivity. Organisation of European Economic Cooperation. 
Paris: 1952. (U.S. Distribution Agent: Columbia University Press). Pp. 104. 
$1.25, Paper. 


Peter O. Steiner, University of California (Berkeley)


Many Americans have had the opportunity of meeting with members of 
one or another of the groups of foreign visitors who have come to the United 
States under the sponsorship of the technical assistance program of the Euro-
pean Cooperation Agency. This thin monograph is a report on what was
learned about methods of productivity measurement by the members of 
three such groups who visited the United States in 1950 to study the produc- 
tivity division of the Bureau of Labor Statistics. Each mission spent five or 
six weeks in Washington listening to lectures by BLS department heads, and 
four weeks in the field visiting industrial firms, universities, and regional 
offices of BLS. The report seems to be largely a condensed summary of the
notes they took. 

Americans will be interested in the report chiefly for appraising whether 
missions of this sort are worthwhile methods of communication. I offer no 
opinion on this subject. The groups felt “the visit to the United States 
proved of great value both for the discussions and exchanges of views which 
it made possible, and for the cordial relations established between the rep- 
resentatives of the Member countries,” and urged that further missions be 
organized in the future. 

The material covered includes: history and organization of the BLS; uses 
of productivity measures; problems in defining and measuring input, out- 
put, and productivity; procedures for collection of data by direct inquiry 
and from secondary sources; and appendixes containing sample question-
naires, lists of data available, and methods for computation of indexes.
Since the text is very short (about 30 pages), it is evident that treatment of 
each topic is brief; actually brevity approaches superficiality on most points. 
From the point of view of content, more systematic and adequate treat- 
ment of the issues is available in many publications; see, for example, the 
International Labor Office, Methods of Labour Productivity Statistics, (Ge- 
neva, 1951), or for a more technical treatment, Irving Siegel, Concepts and 
Measurement of Production and Productivity, (Washington, 1952). 

One apparent purpose of the report is to provide information to the Euro- 
pean countries from which the missions were drawn that might be useful 
to them in establishing, revising, or expanding their programs of productivity 
measurement. On most technical issues, as previously suggested, alternative 
sources will be more helpful. The report does a service, however, by warn- 
ing that BLS methods cannot be transferred without being carefully investi- 
gated and adapted. One of my colleagues tells of the time he requested his 
students in an examination to visualize themselves as top executives and 
indicate how they would solve a particular problem he then set before them. 
One paper was turned in almost immediately. The student had written “I 
would hire you as a consultant.” In much the same way one feels that the 
report implicitly urges any government considering productivity measure- 
ment to hire a BLS statistician as a consultant. This, of course, is sound ad- 
vice. 


Concepts and Measurement of Production and Productivity. Irving H. Siegel. 
Washington: U. S. Bureau of Labor Statistics, 1952. Pp. 108. Paper.


Arthur L. Broida, Board of Governors of the Federal Reserve System


Irving Siegel has been concerned for many years with the statistical meas- 
urement of production and productivity, both as a practitioner at the WPA 
National Research Project and the Bureau of Labor Statistics and as a stu- 
dent. This study is in good part a synthesis of the views originally set forth 
by him in this Journal and elsewhere. It has been reproduced as a working
paper of the National Conference on Productivity, and can be obtained from 
the BLS. 

Between the introduction and summary are four substantive chapters, 
one dealing with concepts and three with technical matters. The most valua- 
ble material is in the technical sections. Among the subjects investigated 
are the relationships between alternative indexes of production (e.g., Paasche 
and Laspeyres) and productivity (e.g., indexes derived by relating employ- 
ment measures to value-weighted and labor requirement-weighted quantity 
indexes); the nature of aggregates; directly calculated indexes and those de- 
rived by deflation; the relationships among indexes of gross output, net out- 
put, and materials consumption; alternative formulations of given indexes; 
coverage adjustments; and the decomposition, or “partitioning,” of changes 
in aggregates into additive contributions of various elements. 
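
The first of these relationships can be made concrete with a minimal sketch. It assumes a hypothetical two-product industry; the prices, quantities, and man-hours are invented for illustration and come neither from Siegel's study nor from this review. The function name `quantity_index` is likewise an assumption of the sketch.

```python
def quantity_index(prices, q_base, q_current):
    """Fixed-weight quantity index: both quantity sets valued at the same prices."""
    numerator = sum(p * q for p, q in zip(prices, q_current))
    denominator = sum(p * q for p, q in zip(prices, q_base))
    return numerator / denominator


# Hypothetical two-product industry (all figures invented for illustration).
p_base, p_current = [1.0, 2.0], [2.0, 2.0]
q_base, q_current = [10.0, 10.0], [10.0, 20.0]
hours_base, hours_current = 100.0, 120.0

laspeyres = quantity_index(p_base, q_base, q_current)    # base-period price weights
paasche = quantity_index(p_current, q_base, q_current)   # current-period price weights
labor_input = hours_current / hours_base

# A productivity index relates the production index to the labor input index.
productivity_laspeyres = laspeyres / labor_input
productivity_paasche = paasche / labor_input

print(round(laspeyres, 4), round(paasche, 4))
print(round(productivity_laspeyres, 4), round(productivity_paasche, 4))
```

With these figures the two production indexes differ (about 1.67 against 1.50), and so do the productivity indexes derived from them, which is the sense in which more than one measure can be legitimate.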

The notion of the multiplicity of legitimate measures of “production” and 
“productivity,” presented in the introduction, is properly given strong em- 
phasis. Also useful is the review of the meanings given these terms in the 
literature of economic theory, national income, and index numbers, which 
is found in the chapter on concepts. The summary chapter contains some 
interesting proposals for research. 

Questions may be raised concerning certain of the main ideas presented. 
A great deal of space is given to the “multiperiod macrotype,” a notion in- 
tended to rationalize the numerical comparisons given by indexes. The 
author observes that value theory permits only ordinal comparisons, and 
these only under highly restrictive assumptions of constancy in tastes, tech- 
nology, etc., so that the usual production and productivity indexes “do not 
have any ‘economic’ import.” The solution is to imagine something called 
the macrotype (also referred to as a “fictional creature,” a “decision maker,” 
a “mythical appraiser,” a “generalized consumer equally at home in all pe- 
riods,” and a “personification of a formula”) whose “relevant behavior is 
not ‘economic’ in the ordinary sense but is described by the specific content 
and structure of the index”—i.e., in whose eyes the index is numerically sig- 
nificant—and then “to judge its plausibility.” 

Apparently there are separate macrotypes for indexes of every possible 
content and structure. The author does not discuss the bases on which their 
relative “plausibilities” are to be determined, but presumably they would 
be the same as are relevant in evaluating the indexes directly. It is not clear, 
therefore, that the interjection of the macrotype greatly facilitates matters. 
The author believes that “The notion of the ‘macrotype’... dramatizes 
the value judgments that underlie numerical comparisons” (page 10) and 
“Without some such conception we should probably have to abandon at- 
tempts to measure changes in the ‘physical volume’ of the physically chang- 
ing goods of an advanced industrial society” (page 39). 

Under a proposal for a “sub-product” approach to index construction, in-
dex makers are urged to classify their data in terms of sub-products of the
vertical stages through which each end-product passes. The sub-products
would be measured in characteristic units and assigned incremental weights.
This procedure is advanced as preferable to making indexes from end-prod- 
uct data for various reasons, most of which can be summarized under the 
headings of greater accuracy and greater flexibility for analysis. 

The sub-product approach, in effect, is already used in the production in- 
dexes of most countries. These indexes follow an “industry” organization, 
with industries separated from one another both horizontally and vertically. 
Thus, there usually are separate industries, and separate index components 
with incremental (value added) weights, for iron ore, pig iron, and steel. 
To extend the method further would require that the “industry” categories 
now used be refined vertically into smaller elements. Undoubtedly this could 
be done in some lines, and it would be desirable to carry it as far as possi- 
ble—particularly where inter-stage inventories are customarily accumulated, 
so that operating rates can differ from one stage to the next. However, in 
most industries as presently defined in the United States any further vertical 
refinement would mean separating successive processes within individual 
plants. With a few exceptions, the difficulties of reporting quantities and, to 
a greater extent, values, would multiply very quickly. 

The author says that the necessary “reorientation of Federal and other 
statistical reporting systems on a grand scale seems very unlikely,” but 
seems to think that it is feasible, and may be undertaken “after some dis- 
illusionment” with present measures. The feasibility of such a large-scale 
program in the foreseeable future is extremely dubious, and if this is the 
“key to substantial further progress,” substantial progress is improbable. 
But there are many keys to progress. One, for example, would be to continue 
to fill out the list of products for which reliable current output data are pos- 
sible. Significant advances have been made, but the gaps still remaining are 
important. Even apart from feasibility questions, this project might well be 
given precedence over the collection of sub-product data. Progress can also 
be made in other important ways, such as refining industries horizontally— 
that is, estimating the value added weight for a product class from census 
data for plants concentrating on that class of product. While this, too, has 
definite limits (some commodities are almost always produced in associa- 
tion with certain others) it would require only more effective exploitation 
of existing data and not collection of additional data. Also, it would be use- 
ful to supplement the usual indexes, covering successive stages and employ- 
ing incremental weights, with value-weighted end-product measures, con- 
fined, say, to finished consumers’ goods and producers’ equipment. Such 
end-product measures would serve many purposes not adequately served by 
present indexes. 

In another recommendation, “free-composition” indexes are advanced as 
preferable to the “customary” chain indexes for resolving “the problem of 
discontinuity of product series due to changes in classification, specifica-
tions, and variety of goods made and reported” (page 70). The discussion
turns on only one of these points—changes in the variety of goods made. 
To handle this problem, which is largely one of “new” products, the author 
proposes writing zeros for periods when the production of a particular com-
modity was zero, and inventing a hypothetical price for weighting purposes
if production was zero in the weight year. This reasonable solution to the new 
product problem has in fact already been used explicitly or implicitly in a 
number of instances. 

This alone, however, would not seem to justify condemning “the ritual of 
shifting the time base, the weights, and the product classes, and then chain- 
ing the links” as superfluous “acrobatics.” Weights are usually changed pe- 
riodically so that they may continue to be reasonably representative of cur- 
rent relationships. The resulting separate links are chained together to avoid 
breaks in the series. The question of how frequently weights should be 
changed is certainly subject to debate, as it involves a compromise between 
the gain in relevance for some comparisons resulting from recent weights 
and the difficulties of interpretation that linking introduces. The author 
does not comment directly on this subject, leaving the implication that in 
his view weights should not be changed, or, perhaps, that differently-weighted 
segments should not be linked together. Apart from weight changes, the 
linking process is often used for individual components of an index to meet 
some of the other problems the author originally lists, including changes in 
classifications used for the reported data, and changes in the variety of goods 
for which figures are reported. It is not made clear how these difficulties can 
be met by “free-composition” measures. 
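
The weight-changing and linking procedure described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure of any particular agency: the three periods of data are invented, and the helper names `quantity_link` and `chain` are hypothetical.

```python
def quantity_link(weights, q_start, q_end):
    """One link: quantity change between two periods at a fixed set of price weights."""
    return (sum(w * q for w, q in zip(weights, q_end))
            / sum(w * q for w, q in zip(weights, q_start)))


def chain(links, base=100.0):
    """Multiply successive links together to obtain one continuous series."""
    levels = [base]
    for link in links:
        levels.append(levels[-1] * link)
    return levels


# Hypothetical data for three periods (invented for illustration).
prices = [[1.0, 2.0], [2.0, 2.0]]                      # weights for periods 0 and 1
quantities = [[10.0, 10.0], [10.0, 20.0], [20.0, 20.0]]

# Each link uses the weights of its own starting period.
links = [quantity_link(prices[t], quantities[t], quantities[t + 1]) for t in range(2)]
chained = chain(links)

# The same end-to-end comparison with period-0 weights held fixed throughout.
fixed = 100.0 * quantity_link(prices[0], quantities[0], quantities[2])

print([round(level, 2) for level in chained], round(fixed, 2))
```

Here the chained series reaches about 222 in the last period while the fixed-weight comparison gives 200. The gap is the price paid for more current weights, and it is the difficulty of interpretation that the compromise on frequency of weight changes has to balance.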

Other questions may be noted. There are many references to the subject 
of “externality,” a condition which exists when an “average” lies outside 
the range of the terms being averaged. But what all the discussion is about 
is not apparent. For some of the cases where this “danger” is pointed out— 
e.g., productivity measures computed from value-weighted production in- 
dexes and labor input indexes—the relationship is not an average at all. In 
this instance the author himself demonstrates, on page 54, that the ratio 
is equivalent to the product of an average and another term. In another 
case—that of coverage adjustments—the problem as posed would appear to 
be more accurately described as a possibly incorrect assumption rather than 
possible “externality.” 
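
The point about the productivity ratio can be checked numerically. The sketch below uses one algebraic decomposition of the ratio into a weighted average of the individual productivity relatives times a labor-shift term; it is offered as an assumption of the sketch and is not claimed to be the form given on Siegel's page 54. All data and names are hypothetical.

```python
def productivity_decomposition(prices, q0, l0, q1, l1):
    """Split (value-weighted output index / labor input index) into two factors."""
    n = range(len(prices))
    output_index = (sum(prices[i] * q1[i] for i in n)
                    / sum(prices[i] * q0[i] for i in n))
    labor_index = sum(l1) / sum(l0)
    ratio = output_index / labor_index

    # Base-period value of output per man-hour, product by product.
    v = [prices[i] * q0[i] / l0[i] for i in n]
    # Productivity relative (output per man-hour, current over base) for each product.
    r = [(q1[i] / l1[i]) / (q0[i] / l0[i]) for i in n]

    # Factor 1: weighted average of the relatives, with weights v[i] * l1[i].
    average = sum(v[i] * r[i] * l1[i] for i in n) / sum(v[i] * l1[i] for i in n)
    # Factor 2: change in average base-period value per man-hour from shifting labor.
    shift = ((sum(v[i] * l1[i] for i in n) / sum(l1))
             / (sum(v[i] * l0[i] for i in n) / sum(l0)))
    return ratio, r, average, shift


# Hypothetical case: output per man-hour is unchanged in both products,
# but labor moves toward the product with the higher value per man-hour.
ratio, relatives, average, shift = productivity_decomposition(
    prices=[1.0, 4.0], q0=[10.0, 10.0], l0=[10.0, 10.0],
    q1=[5.0, 15.0], l1=[5.0, 15.0],
)
print(ratio, relatives, average, shift)
```

Both relatives equal 1.0, yet the ratio is 1.3. It lies outside their range only because of the shift factor, so the ratio is not an average of the relatives in the first place.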

The treatment of coverage adjustments is hardly adequate. The discus- 
sion and evaluation are confined to one of the two problems such adjust- 
ments are designed to meet, that of representing quantity changes for prod- 
ucts reported in value terms only. The existence of the other problem—that 
of eliminating from an industry measure the output of industry-type prod- 
ucts actually made elsewhere, and included in the quantity data—is merely 
mentioned in a footnote. The author observes on page 63 that while the 
adjustment rests on a specific assumption (similar average price changes 
for two sets of goods), the use of unadjusted measures also implies an as-
sumption (similar average quantity changes for these goods). But three
pages later he reports with apparent approval that both the WPA and the
BLS rejected the coverage adjustment because, among other reasons, they
preferred not to introduce an “additional” assumption. A hypothetical ex-
ample is used to demonstrate that adjusted indexes may yield poorer results
than unadjusted measures in the case of new products unreported in quan- 
tity terms (page 67). Calculation indicates, however, that with the prices 
and quantities assumed in the example the “new” product accounts for about 
43 per cent of the industry’s value of output before its growth and 25 per 
cent after. With more realistic figures adjusted indexes would usually be 
found to understate the growth of new products, but by substantially smaller 
amounts than unadjusted measures. The question of new products, inci- 
dentally, while intriguing, seems to be greatly overrated as a practical prob- 
lem, at least for the United States, in this study and elsewhere. 
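
The two assumptions contrasted above can be seen in a short sketch of a coverage adjustment. The form used here, scaling the index by the change in the ratio of covered to total value, is a common one and is assumed for illustration; it is not taken from Siegel's text. The figures are invented, and they are not those of his example on page 67.

```python
def coverage_adjusted_index(unadjusted, covered_value, total_value):
    """Scale a quantity index by the change in the coverage ratio.

    covered_value and total_value are (base, current) pairs of values of output.
    """
    coverage_base = covered_value[0] / total_value[0]
    coverage_current = covered_value[1] / total_value[1]
    return unadjusted * coverage_base / coverage_current


# Hypothetical industry: quantities are reported for only part of its output.
unadjusted = 1.25                  # quantity index of the covered products
covered_value = (80.0, 120.0)      # value of covered products, base and current
total_value = (100.0, 180.0)       # value of all products, base and current

adjusted = coverage_adjusted_index(unadjusted, covered_value, total_value)

# Equivalent reading: deflate the total value index by the price change
# implied for the covered products (the "similar price changes" assumption).
implied_price_index = (covered_value[1] / covered_value[0]) / unadjusted
deflated_total = (total_value[1] / total_value[0]) / implied_price_index

print(round(adjusted, 4), round(deflated_total, 4))
```

Using the unadjusted 1.25 for the whole industry assumes that the uncovered products grew in quantity as the covered ones did. Using the adjusted 1.5 assumes instead that their prices moved as the covered prices did. Neither figure is free of an assumption.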

The author catalogues the sins of index users, and asserts that index mak- 
ers prefer comfortable tradition to “the search for promising new paths.” 
The only basis offered for this assertion is the index makers’ continued ne- 
glect of Mr. Siegel’s proposals. Also, in connection with one recent innova- 
tion touched on in the text, the author’s information is incorrect: “To over- 
come objections to publication raised by interested groups aware of the 
practical consequences of a few percentage points, the U. S. Bureau of Cen- 
sus has been obliged to release three differently weighted 1947 indexes, not 
merely one, for each manufacturing industry” (pages 6-7). Mr. Siegel and 
his readers will be glad to know that the decision in the 1939-1947 bench- 
mark index project to compile and publish six alternative indexes for each 
industry (under the three weighting systems, with and without coverage 
adjustments) was made in the earliest planning stage of the work, and with- 
out reference to the kind of considerations suggested. 

Emphasis has been given here to some of the more controversial aspects 
of the study, but as has been noted it contains much useful and instructive 
material. The volume comes at a time when interest in the field is high, 
with old indexes undergoing revision and new ones being developed in many 
countries. 